This module extends code contained in Coronavirus_Statistics_v005.Rmd to include sourcing of updated functions and parameters. This file includes the latest code for analyzing all-cause death data from CDC Weekly Deaths by Jurisdiction. CDC maintains data on deaths by week, age cohort, and state in the US. Downloaded data are unique by state, epidemiological week, year, age, and type (actual vs. predicted/projected).
These data are known to have a lag between death and reporting. The CDC back-corrects so that each death is reported in the week it occurred, even if the report arrives in a later week. As a result, totals for recent weeks tend to run low, and the CDC projects the expected total number of deaths given historical lag times. Other analysts have reported that there is currently significant supra-lag: lag times are much longer than historical averages, which causes the CDC projected deaths for recent weeks to be low.
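To make the lag mechanics concrete, the sketch below uses invented completeness fractions (not CDC values) to show why raw counts for recent weeks run low, and why a projection based on historical lag also runs low when the current lag is longer than usual. All names and numbers are assumptions for illustration only.

```r
# Hypothetical illustration only: every number below is invented, not taken from CDC data
trueDeaths <- rep(60000, 6)                               # assumed true weekly deaths
histComplete <- c(0.99, 0.97, 0.93, 0.85, 0.70, 0.40)     # assumed historical share reported (oldest to newest week)
actualComplete <- c(0.99, 0.96, 0.90, 0.78, 0.58, 0.28)   # assumed current share reported (longer lag)

reported <- trueDeaths * actualComplete   # counts visible in the download
projected <- reported / histComplete      # projection that assumes historical lag still holds

lagDemo <- data.frame(weeksAgo=6:1,
                      reported=round(reported),
                      projected=round(projected),
                      shortfall=round(trueDeaths - projected)
                      )
lagDemo
```

In this toy setup the projection closes part of the gap for recent weeks but still falls short of the assumed true count wherever `actualComplete` is below `histComplete`.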
The code leverages tidyverse and sourced functions throughout:
# All functions assume that tidyverse and its components are loaded and available
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 1.0.0
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.5.0
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
# If the same function is in both files, use the version from the more specific source
source("./Generic_Added_Utility_Functions_202105_v001.R")
source("./Coronavirus_CDC_Excess_Functions_v001.R")
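The ordering of the two `source()` calls matters because a later definition replaces an earlier one with the same name. A minimal sketch, using temporary files and a hypothetical function name `helperFn` (neither appears in the sourced files):

```r
# Hypothetical file contents and function name, used only to show source() ordering
genericFile <- tempfile(fileext=".R")
specificFile <- tempfile(fileext=".R")
writeLines('helperFn <- function() "generic version"', genericFile)
writeLines('helperFn <- function() "specific version"', specificFile)

source(genericFile)    # defines helperFn()
source(specificFile)   # redefines helperFn(), replacing the generic definition
helperFn()
```

Sourcing the generic utilities first and the topic-specific file second therefore leaves the more specific version in place.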
The basic process sets parameters (step 0) and then runs three data update steps:
# STEP 0: Appropriate parameters for 2022 data
cdcExcessParams <- list(remapVars=c('Jurisdiction'='fullState',
'Week Ending Date'='weekEnding',
'State Abbreviation'='state',
'Age Group'='age',
'Number of Deaths'='deaths',
'Time Period'='period',
'Year'='year',
'Week'='week'
),
colTypes="ccciicdcccc",
ageLevels=c("Under 25 years",
"25-44 years",
"45-64 years",
"65-74 years",
"75-84 years",
"85 years and older"
),
periodLevels=c("2015-2019", "2020", "2021", "2022"),
periodKeep=c("2015-2019", "2020", "2021"),
yearLevels=2015:2022
)
# STEP 1: Latest CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20220623.csv"
cdcList_20220623 <- readRunCDCAllCause(loc=cdcLoc,
weekThru=21,
lst=readFromRDS("cdc_daily_220602"),
stateNoCheck=c(),
pdfCluster=TRUE,
pdfAge=TRUE
)
##
## Parameter cvDeathThru has been set as: 2022-05-28
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## state weekEnding year week age
## 1 SD 2022-04-30 2022 17 65-74 years
## 2 SD 2022-04-30 2022 17 75-84 years
## Suppress deaths
## 1 Suppressed (counts highly incomplete, <50% of expected) NA
## 2 Suppressed (counts highly incomplete, <50% of expected) NA
##
##
## Problems by state:
## # A tibble: 1 x 5
## noCheck state problem n deaths
## <lgl> <chr> <lgl> <int> <dbl>
## 1 FALSE SD TRUE 2 NA
##
##
## There are 2 rows with errors; maximum for any given state is 2 errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 106,840
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala~
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10~
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",~
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,~
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,~
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8~
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015~
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted ~
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,~
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 x 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 12528 0 434501
## 2 25-44 years 16114 0 1115606
## 3 45-64 years 19554 0 4261157
## 4 65-74 years 19547 0 4306424
## 5 75-84 years 19554 0 5271898
## 6 85 years and older 19543 0 6662410
##
##
## Checking variable combination: period year Type
## # A tibble: 8 x 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432792
## 7 2021 2021 Predicted (weighted) 14698 0 3451431
## 8 2022 2022 Predicted (weighted) 5275 0 1267614
##
##
## Checking variable combination: period Suppress
## # A tibble: 4 x 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432792
## 3 2021 <NA> 14698 0 3451431
## 4 2022 <NA> 5275 0 1267614
##
##
## Checking variable combination: period Note
## # A tibble: 9 x 5
## period Note n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-20~ <NA> 72033 0 1.39e7
## 2 2020 Data in recent weeks are incomplete. Only ~ 279 0 8.68e4
## 3 2020 <NA> 14555 0 3.35e6
## 4 2021 Data in recent weeks are incomplete. Only ~ 12116 0 2.42e6
## 5 2021 Data in recent weeks are incomplete. Only ~ 10 0 2.58e2
## 6 2021 Data in recent weeks are incomplete. Only ~ 2572 0 1.04e6
## 7 2022 Data in recent weeks are incomplete. Only ~ 4347 0 1.06e6
## 8 2022 Data in recent weeks are incomplete. Only ~ 76 0 1.80e4
## 9 2022 Data in recent weeks are incomplete. Only ~ 852 0 1.90e5
##
## *** File has been checked for uniqueness by: cluster year week
##
## Plots will be run after excluding stateNoCheck states
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w21.pdf
##
## Returning plot outputs to the main log file
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w21.pdf
##
## Returning plot outputs to the main log file
saveToRDS(cdcList_20220623, ovrWriteError=FALSE)
# STEP 2: Latest death by location-cause data
allCause_220623 <- analyzeAllCause(loc="COvID_deaths_age_place_20220623.csv",
cdcDailyList=readFromRDS("cdc_daily_220602"),
compareThruDate="2022-05-31"
)
## `summarise()` has grouped output by 'State'. You can override using the `.groups` argument.
##
## States without abbreviations
## # A tibble: 2 x 10
## # Groups: State [2]
## State abb Year Month covidDeaths totalDeaths pneumoDeaths pneumoCovidDeat~
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl>
## 1 New Y~ <NA> 0 0 35136 170882 22567 13036
## 2 Puert~ <NA> 0 0 4311 78570 11023 3082
## # ... with 2 more variables: fluDeaths <dbl>, pnemoFluCovidDeaths <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 1,748 x 12
## asofDate startDate endDate Group State deathPlace Age name dfSub
## <date> <date> <date> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 2022-06-02 2020-10-01 2020-10-31 By Mo~ Unite~ Total - All~ 30-3~ pnem~ 205
## 2 2022-06-02 2021-08-01 2021-08-31 By Mo~ Unite~ Other All ~ pneu~ 671
## 3 2022-06-02 2021-10-01 2021-10-31 By Mo~ Unite~ Decedent's ~ 40-4~ pnem~ 149
## 4 2022-06-02 2020-02-01 2020-02-29 By Mo~ Unite~ Total - All~ 30-3~ pnem~ 71
## 5 2022-06-02 2021-11-01 2021-11-30 By Mo~ Unite~ Healthcare ~ 75-8~ pnem~ 139
## 6 2022-06-02 2020-11-01 2020-11-30 By Mo~ Unite~ Total - All~ 30-3~ pneu~ 227
## 7 2022-06-02 2022-04-01 2022-04-30 By Mo~ Unite~ Total - All~ All ~ fluD~ 168
## 8 2022-06-02 2020-08-01 2020-08-31 By Mo~ Unite~ Other 0-17~ tota~ 116
## 9 2022-06-02 2020-09-01 2020-09-30 By Mo~ Unite~ Decedent's ~ 50-6~ pnem~ 190
## 10 2022-06-02 2021-10-01 2021-10-31 By Mo~ Unite~ Decedent's ~ 65-7~ pneu~ 86
## # ... with 1,738 more rows, and 3 more variables: dfTot <dbl>, delta <dbl>,
## # pct <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 x 12
## # ... with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 x 12
## # ... with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # A tibble: 51 x 4
## abb cumValue tot_deaths pctdiff
## <chr> <dbl> <dbl> <dbl>
## 1 NY 36518 68346 0.304
## 2 DC 2010 1343 0.199
## 3 ND 2777 2283 0.0976
## 4 NC 28931 24660 0.0797
## 5 GA 32614 38198 0.0789
## 6 WY 1577 1820 0.0715
## 7 NE 4947 4290 0.0711
## 8 OH 43659 38628 0.0611
## 9 MI 32215 36357 0.0604
## 10 OK 16139 14420 0.0563
## # ... with 41 more rows
## # A tibble: 1 x 3
## cumValue tot_deaths pctdiff
## <dbl> <dbl> <dbl>
## 1 969868 997512 1.82
saveToRDS(allCause_220623, ovrWriteError=FALSE)
# STEP 3: Facets for excess all-cause deaths
excessDeathFacets(lstCDC=cdcList_20220623, lstAll=allCause_220623, dateThru="2022-04-30", plotYLim=c(-200, 1200))
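The STEP 1 log reports that the file was checked for uniqueness by state, year, week, and age. The internals of the sourced function are not shown in this document; a minimal sketch of such a check, using a hypothetical helper `checkUniqueKeys()` and toy data, might look like this:

```r
library(dplyr)

# Hypothetical helper (not the sourced implementation): stop if any key combination repeats
checkUniqueKeys <- function(df, keys) {
  dups <- df %>%
    count(across(all_of(keys)), name="nRows") %>%
    filter(nRows > 1)
  if (nrow(dups) > 0) stop("Duplicate rows found for keys: ", paste(keys, collapse=" "))
  cat("File has been checked for uniqueness by:", keys, "\n")
  invisible(TRUE)
}

# Toy data in the processed layout (invented counts)
toy <- tibble::tibble(state=c("AL", "AL", "AK"),
                      year=c(2015L, 2015L, 2015L),
                      week=c(1L, 2L, 1L),
                      age=c("25-44 years", "25-44 years", "25-44 years"),
                      deaths=c(67, 49, 12)
                      )
checkUniqueKeys(toy, keys=c("state", "year", "week", "age"))
```

Failing loudly on duplicates is a design choice for this sketch; a check that only warns would allow double-counted rows to flow into the aggregates.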
Updated with the latest data:
# STEP 1: Latest CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20220713.csv"
cdcList_20220713 <- readRunCDCAllCause(loc=cdcLoc,
weekThru=24,
lst=readFromRDS("cdc_daily_220704"),
stateNoCheck=c(),
pdfCluster=TRUE,
pdfAge=TRUE
)
##
## Parameter cvDeathThru has been set as: 2022-06-18
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## [1] state weekEnding year week age Suppress deaths
## <0 rows> (or 0-length row.names)
##
##
## Problems by state:
## # A tibble: 0 x 5
## # ... with 5 variables: noCheck <lgl>, state <chr>, problem <lgl>, n <int>,
## # deaths <dbl>
## Warning in max(.): no non-missing arguments to max; returning -Inf
##
##
## There are 0 rows with errors; maximum for any given state is -Inf errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 108,099
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala~
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10~
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",~
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,~
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,~
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8~
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015~
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted ~
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,~
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 x 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 12543 0 432096
## 2 25-44 years 16323 0 1118247
## 3 45-64 years 19812 0 4307809
## 4 65-74 years 19806 0 4368517
## 5 75-84 years 19813 0 5351113
## 6 85 years and older 19802 0 6752462
##
##
## Checking variable combination: period year Type
## # A tibble: 8 x 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432816
## 7 2021 2021 Predicted (weighted) 14702 0 3450646
## 8 2022 2022 Predicted (weighted) 6530 0 1546623
##
##
## Checking variable combination: period Suppress
## # A tibble: 4 x 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432816
## 3 2021 <NA> 14702 0 3450646
## 4 2022 <NA> 6530 0 1546623
##
##
## Checking variable combination: period Note
## # A tibble: 9 x 5
## period Note n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-20~ <NA> 72033 0 1.39e7
## 2 2020 Data in recent weeks are incomplete. Only ~ 279 0 8.69e4
## 3 2020 <NA> 14555 0 3.35e6
## 4 2021 Data in recent weeks are incomplete. Only ~ 13990 0 3.20e6
## 5 2021 Data in recent weeks are incomplete. Only ~ 15 0 4.01e2
## 6 2021 Data in recent weeks are incomplete. Only ~ 697 0 2.51e5
## 7 2022 Data in recent weeks are incomplete. Only ~ 1058 0 1.61e5
## 8 2022 Data in recent weeks are incomplete. Only ~ 86 0 7.94e3
## 9 2022 Data in recent weeks are incomplete. Only ~ 5386 0 1.38e6
##
## *** File has been checked for uniqueness by: cluster year week
##
## Plots will be run after excluding stateNoCheck states
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w24.pdf
##
## Returning plot outputs to the main log file
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w24.pdf
##
## Returning plot outputs to the main log file
saveToRDS(cdcList_20220713, ovrWriteError=FALSE)
# STEP 2: Latest death by location-cause data
allCause_220713 <- analyzeAllCause(loc="COvID_deaths_age_place_20220713.csv",
cdcDailyList=readFromRDS("cdc_daily_220704"),
compareThruDate="2022-06-30"
)
## `summarise()` has grouped output by 'State'. You can override using the `.groups` argument.
##
## States without abbreviations
## # A tibble: 2 x 10
## # Groups: State [2]
## State abb Year Month covidDeaths totalDeaths pneumoDeaths pneumoCovidDeat~
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl>
## 1 New Y~ <NA> 0 0 35270 174129 22877 13064
## 2 Puert~ <NA> 0 0 4459 80624 11310 3179
## # ... with 2 more variables: fluDeaths <dbl>, pnemoFluCovidDeaths <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 1,818 x 12
## asofDate startDate endDate Group State deathPlace Age name dfSub
## <date> <date> <date> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 2022-07-06 2020-10-01 2020-10-31 By Mo~ Unite~ Total - All~ 30-3~ pnem~ 205
## 2 2022-07-06 2021-10-01 2021-10-31 By Mo~ Unite~ Decedent's ~ 40-4~ pnem~ 150
## 3 2022-07-06 2020-02-01 2020-02-29 By Mo~ Unite~ Total - All~ 30-3~ pnem~ 71
## 4 2022-07-06 2021-11-01 2021-11-30 By Mo~ Unite~ Healthcare ~ 75-8~ pnem~ 139
## 5 2022-07-06 2022-04-01 2022-04-30 By Mo~ Unite~ Total - All~ All ~ fluD~ 184
## 6 2022-07-06 2020-11-01 2020-11-30 By Mo~ Unite~ Total - All~ 30-3~ pneu~ 227
## 7 2022-07-06 2021-08-01 2021-08-31 By Mo~ Unite~ Other All ~ pneu~ 627
## 8 2022-07-06 2022-06-01 2022-06-30 By Mo~ Unite~ Decedent's ~ 85 y~ pneu~ 183
## 9 2022-07-06 2020-01-01 2022-07-02 By To~ Unite~ Total - All~ 0-17~ fluD~ 50
## 10 2022-07-06 2020-01-01 2022-07-02 By To~ Unite~ Total - All~ 30-3~ fluD~ 200
## # ... with 1,808 more rows, and 3 more variables: dfTot <dbl>, delta <dbl>,
## # pct <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 x 12
## # ... with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 x 12
## # ... with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # A tibble: 51 x 4
## abb cumValue tot_deaths pctdiff
## <chr> <dbl> <dbl> <dbl>
## 1 NY 36925 69007 0.303
## 2 DC 1994 1351 0.192
## 3 WY 1462 1834 0.113
## 4 ND 2802 2296 0.0993
## 5 GA 32661 38579 0.0831
## 6 NC 29438 25211 0.0773
## 7 MI 32104 36918 0.0697
## 8 NE 4986 4342 0.0690
## 9 AZ 26808 30515 0.0647
## 10 OH 44034 38852 0.0625
## # ... with 41 more rows
## # A tibble: 1 x 3
## cumValue tot_deaths pctdiff
## <dbl> <dbl> <dbl>
## 1 974598 1008140 1.91
## Warning: Removed 8 rows containing missing values (geom_col).
## Warning: Removed 8 rows containing missing values (geom_col).
saveToRDS(allCause_220713, ovrWriteError=FALSE)
# STEP 3: Facets for excess all-cause deaths
excessDeathFacets(lstCDC=cdcList_20220713, lstAll=allCause_220713, dateThru="2022-05-31", plotYLim=c(-200, 1200))
There have been issues with US all-cause deaths data since a “systems upgrade” in mid-June. How much restatement of data has occurred?
# Mapping file of epiweek and epiyear to date
mapEpi <- tibble::tibble(date=seq.Date(as.Date("2014-12-01"), as.Date("2031-01-31"), by=1)) %>%
mutate(epiYear=as.integer(lubridate::epiyear(date)), epiWeek=as.integer(lubridate::epiweek(date)))
nameFile <- "ageAgg"
dfCheck <- bind_rows(readFromRDS("cdcList_20220713")[[nameFile]],
readFromRDS("cdcList_20220623")[[nameFile]],
readFromRDS("cdcList_20220105")[[nameFile]],
.id="fileDate"
) %>%
mutate(fileDate=c("1"="2022-07-13", "2"="2022-06-23", "3"="2022-01-05")[fileDate])
mapEpi %>%
arrange(date) %>%
group_by(epiYear, epiWeek) %>%
filter(row_number()==1) %>%
ungroup() %>%
rename(yearint=epiYear, week=epiWeek) %>%
right_join(dfCheck, by=c("yearint", "week")) %>%
ggplot(aes(x=date, y=deaths)) +
geom_line(aes(color=fileDate, group=fileDate)) +
lims(y=c(0, NA)) +
labs(x=NULL, y="Reported all-cause US deaths", title="US all-cause deaths by report date") +
facet_wrap(~age, scales="free_y")
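The mapping step assigns one calendar date to each epidemiological year-week pair by keeping the first date in each group. A small sketch of the same idea on a two-week range spanning a year boundary (assuming lubridate is installed; `toyEpi` and `toyMap` are illustrative names) shows why the epidemiological year can differ from the calendar year:

```r
library(dplyr)

# Dates spanning the 2021/2022 boundary
toyEpi <- tibble::tibble(date=seq.Date(as.Date("2021-12-26"), as.Date("2022-01-08"), by="day")) %>%
  mutate(epiYear=as.integer(lubridate::epiyear(date)),
         epiWeek=as.integer(lubridate::epiweek(date))
         )

# Keep the first date of each epiYear-epiWeek pair, as in mapEpi
toyMap <- toyEpi %>%
  arrange(date) %>%
  group_by(epiYear, epiWeek) %>%
  filter(row_number()==1) %>%
  ungroup()
toyMap
```

Here 2022-01-01 belongs to epiweek 52 of epiyear 2021, so joining on calendar year and week would misplace it.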
Data appear anomalous, particularly 2022 deaths in “Under 25 years” and “25-44 years”. Partly, this is incomplete reporting in the most recent weeks (normal), but partly this may be driven by data not yet re-entered after the upgrade. It is striking that any cohort has fewer reported all-cause deaths in the 2022-07-13 data than in the 2022-06-23 data, since all-cause counts almost always increase as additional reports are received from vital statistics departments. Trends among “45-64 years” and the older cohorts show, at a glance, the more commonly observed build over time.
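One way to put a number on this kind of restatement is to join two file vintages on week and age and flag cells where the later file reports fewer deaths. The sketch below uses invented counts and hypothetical column names (`deaths_old`, `deaths_new`); it is not part of the sourced functions.

```r
library(dplyr)

# Invented counts for two file vintages (not CDC data)
older <- tibble::tibble(age=rep(c("Under 25 years", "45-64 years"), each=3),
                        week=rep(18:20, times=2),
                        deaths_old=c(1000, 950, 700, 12000, 11500, 9000)
                        )
newer <- tibble::tibble(age=rep(c("Under 25 years", "45-64 years"), each=3),
                        week=rep(18:20, times=2),
                        deaths_new=c(900, 800, 650, 12100, 11900, 10500)
                        )

# Later file minus earlier file, with a flag for decreases
restated <- older %>%
  inner_join(newer, by=c("age", "week")) %>%
  mutate(delta=deaths_new-deaths_old, decreased=delta < 0)

# Share of week-age cells where the later file is lower, by age group
restated %>%
  group_by(age) %>%
  summarize(pctDecreased=mean(decreased), netDelta=sum(delta), .groups="drop")
```

In the toy data the younger cohort decreases in every week while the older cohort only builds, which mirrors the pattern described in the text.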
The process is converted to functional form:
makeRestatementData <- function(vecFiles, key, vecNames=NULL, epiRange=as.Date(c("2014-12-01", "2031-01-31"))) {
# FUNCTION ARGUMENTS:
# vecFiles: character vector of file names (will be extracted using readFromRDS)
# key: the extract element from each of the lists
# vecNames: names to be used in plot for each of the extracts (NULL means infer from ...)
# epiRange: range for converting epiweek and epiyear to date (should be a larger range than data)
# Add names to vecNames if not passed
if(!is.null(vecNames) && is.null(names(vecNames)))
vecNames <- vecNames %>% purrr::set_names(as.character(seq_along(vecFiles)))
# Create vecNames from the file names if not provided
if(is.null(vecNames)) {
vecNames <- as.character(lubridate::ymd(stringr::str_remove(vecFiles, ".*_"))) %>%
purrr::set_names(as.character(seq_along(vecFiles)))
}
# Create epi mapping file
dfEpi <- tibble::tibble(date=seq.Date(epiRange[1], epiRange[2], by=1)) %>%
mutate(epiYear=as.integer(lubridate::epiyear(date)),
epiWeek=as.integer(lubridate::epiweek(date))
)
# Create single date for each epiWeek and epiYear
mapEpi <- dfEpi %>%
arrange(date) %>%
group_by(epiYear, epiWeek) %>%
filter(row_number()==1) %>%
ungroup() %>%
rename(yearint=epiYear, week=epiWeek)
# Read and integrate file, add epiDate
purrr::map_dfr(.x=vecFiles,
.f=function(x) readFromRDS(x)[[key]],
.id="fileDate"
) %>%
mutate(fileDate=vecNames[fileDate]) %>%
left_join(mapEpi, by=c("yearint", "week"))
}
plotRestatementData <- function(df, wrapBy=NULL, asRatio=FALSE) {
# FUNCTION ARGUMENTS:
# df: data frame or tibble formatted for plotting
# wrapBy: variable for facet_wrap (NULL means infer from file, FALSE means do not wrap)
# asRatio: boolean, should ratios be plotted rather than values?
# Create the appropriate wrapBy if passed as NULL
if(is.null(wrapBy)) {
if("age" %in% names(df)) wrapBy <- "age"
else if ("state" %in% names(df)) wrapBy <- "state"
else if ("cluster" %in% names(df)) wrapBy <- "cluster"
else wrapBy <- FALSE
}
plotTitle <- "US all-cause deaths by report date"
plotSubTitle <- NULL
plotYAxis <- "Reported all-cause US deaths"
# Create ratios if appropriate
if(isTRUE(asRatio)) {
groupVars <- c("date")
if(!isFALSE(wrapBy)) groupVars <- c(groupVars, wrapBy)
df <- df %>%
rename(trueFileDate=fileDate, trueDeaths=deaths) %>%
arrange(trueFileDate) %>%
group_by_at(all_of(groupVars)) %>%
mutate(n=n(),
fileDate=ifelse(row_number()==1, trueFileDate, paste0(trueFileDate, " vs. ", lag(trueFileDate))),
deaths=ifelse(row_number()==1, trueDeaths, trueDeaths/lag(trueDeaths))
) %>%
ungroup()
plotTitle <- "Ratio of US all-cause deaths by report date"
plotSubTitle <- "Ratios filtered to exclude NA and results greater than 3"
plotYAxis <- "Ratio of reported all-cause US deaths"
}
# Create base plot
p1 <- df %>%
filter(if(isTRUE(asRatio)) fileDate != min(fileDate) else TRUE) %>%
filter(if(isTRUE(asRatio)) !is.na(deaths) & deaths <= 3 else TRUE) %>%
ggplot(aes(x=date, y=deaths)) +
geom_line(aes(color=fileDate, group=fileDate)) +
lims(y=c(0, NA)) +
labs(x=NULL, y=plotYAxis, subtitle=plotSubTitle, title=plotTitle) +
scale_color_discrete("File Date")
# Add line at 1.0 if ratio
if(isTRUE(asRatio)) p1 <- p1 + geom_hline(yintercept=1, lty=2)
# Add facetting if appropriate
if(!isFALSE(wrapBy)) p1 <- p1 + facet_wrap(~get(wrapBy), scales="free_y")
# Print the plot
print(p1)
}
makeRestatementData(c("cdcList_20220713", "cdcList_20220623", "cdcList_20220105"), key="ageAgg")
## # A tibble: 6,810 × 12
## fileDate age year week deaths weekfct yearint pred delta cumDe…¹ cumPred
## <chr> <fct> <fct> <int> <dbl> <fct> <int> <dbl> <dbl> <dbl> <dbl>
## 1 2022-07… Unde… 2015 1 1069 1 2015 1143. -74.4 -74.4 1143.
## 2 2022-07… Unde… 2016 1 1067 1 2016 1122. -55.0 -55.0 1122.
## 3 2022-07… Unde… 2017 1 1147 1 2017 1101. 46.4 46.4 1101.
## 4 2022-07… Unde… 2018 1 1185 1 2018 1079. 106. 106. 1079.
## 5 2022-07… Unde… 2019 1 1035 1 2019 1058. -22.8 -22.8 1058.
## 6 2022-07… Unde… 2020 1 1101 1 2020 1036. 64.6 64.6 1036.
## 7 2022-07… Unde… 2021 1 1072 1 2021 1015. 57.0 57.0 1015.
## 8 2022-07… Unde… 2022 1 931 1 2022 994. -62.6 -62.6 994.
## 9 2022-07… Unde… 2015 2 1103 2 2015 1133. -30.0 -104. 2276.
## 10 2022-07… Unde… 2016 2 1068 2 2016 1112. -43.6 -98.6 2234.
## # … with 6,800 more rows, 1 more variable: date <date>, and abbreviated
## # variable name ¹cumDelta
makeRestatementData(c("cdcList_20220713", "cdcList_20220623", "cdcList_20220105"), key="ageAgg") %>%
plotRestatementData()
makeRestatementData(c("cdcList_20220713", "cdcList_20220623", "cdcList_20220105"), key="ageAgg") %>%
plotRestatementData(asRatio=TRUE)
## Warning: Using `all_of()` outside of a selecting function was deprecated in tidyselect
## 1.2.0.
## ℹ See details at
## <https://tidyselect.r-lib.org/reference/faq-selection-context.html>
makeRestatementData(c("cdcList_20220713", "cdcList_20220623", "cdcList_20220105"), key="allUSAgg") %>%
plotRestatementData()
makeRestatementData(c("cdcList_20220713", "cdcList_20220623", "cdcList_20220105"), key="allUSAgg") %>%
plotRestatementData(asRatio=TRUE)
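The tidyselect deprecation warning logged earlier in this chunk is triggered by `all_of()` inside `group_by_at()`. One commonly suggested replacement is `group_by(across(all_of(...)))`. The sketch below applies it to the ratio step on toy data (invented counts); it is an assumption about how the function could be updated, not a tested change to `plotRestatementData()`.

```r
library(dplyr)

# Toy data (invented counts): two file vintages covering the same two weeks
toyDeaths <- tibble::tibble(fileDate=rep(c("2022-06-23", "2022-07-13"), each=2),
                            date=rep(as.Date(c("2022-05-01", "2022-05-08")), times=2),
                            deaths=c(1000, 800, 1100, 1000)
                            )
groupVars <- c("date")

toyRatio <- toyDeaths %>%
  arrange(fileDate) %>%
  group_by(across(all_of(groupVars))) %>%   # replaces group_by_at(all_of(groupVars))
  mutate(ratio=deaths/lag(deaths)) %>%      # later vintage divided by the prior vintage
  ungroup()
toyRatio
```

The earliest vintage has no prior file to compare against, so its ratio is `NA`, which is why the plotting function drops the minimum `fileDate` when `asRatio=TRUE`.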
GitHub user USMortality stores archived all-cause deaths data. The file from 2022 week 17 is downloaded and processed:
# STEP 1: Archived CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_2022_17.txt"
cdcList_arch_2022w17 <- readRunCDCAllCause(loc=cdcLoc,
weekThru=16,
lst=readFromRDS("cdc_daily_220704"),
stateNoCheck=c(),
pdfCluster=TRUE,
pdfAge=TRUE
)
##
## Parameter cvDeathThru has been set as: 2022-04-23
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## state weekEnding year week age
## 1 NE 2022-04-23 2022 16 65-74 years
## 2 NE 2022-04-23 2022 16 75-84 years
## 3 NE 2022-04-23 2022 16 85 years and older
## 4 IN 2022-04-16 2022 15 25-44 years
## 5 IN 2022-04-16 2022 15 45-64 years
## 6 IN 2022-04-16 2022 15 65-74 years
## 7 IN 2022-04-16 2022 15 75-84 years
## 8 IN 2022-04-16 2022 15 85 years and older
## Suppress deaths
## 1 Suppressed (counts highly incomplete, <50% of expected) NA
## 2 Suppressed (counts highly incomplete, <50% of expected) NA
## 3 Suppressed (counts highly incomplete, <50% of expected) NA
## 4 Suppressed (counts highly incomplete, <50% of expected) NA
## 5 Suppressed (counts highly incomplete, <50% of expected) NA
## 6 Suppressed (counts highly incomplete, <50% of expected) NA
## 7 Suppressed (counts highly incomplete, <50% of expected) NA
## 8 Suppressed (counts highly incomplete, <50% of expected) NA
##
##
## Problems by state:
## # A tibble: 2 x 5
## noCheck state problem n deaths
## <lgl> <chr> <lgl> <int> <dbl>
## 1 FALSE IN TRUE 5 NA
## 2 FALSE NE TRUE 3 NA
##
##
## There are 8 rows with errors; maximum for any given state is 5 errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 105,996
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala~
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10~
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",~
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,~
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,~
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8~
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015~
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted ~
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,~
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 x 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 12422 0 430722
## 2 25-44 years 15982 0 1105179
## 3 45-64 years 19401 0 4228337
## 4 65-74 years 19397 0 4270304
## 5 75-84 years 19403 0 5227671
## 6 85 years and older 19391 0 6612949
##
##
## Checking variable combination: period year Type
## # A tibble: 8 x 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432787
## 7 2021 2021 Predicted (weighted) 14696 0 3452019
## 8 2022 2022 Predicted (weighted) 4433 0 1090197
##
##
## Checking variable combination: period Suppress
## # A tibble: 4 x 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432787
## 3 2021 <NA> 14696 0 3452019
## 4 2022 <NA> 4433 0 1090197
##
##
## Checking variable combination: period Note
## # A tibble: 8 x 5
## period Note n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-20~ <NA> 72033 0 1.39e7
## 2 2020 Data in recent weeks are incomplete. Only ~ 279 0 8.68e4
## 3 2020 <NA> 14555 0 3.35e6
## 4 2021 Data in recent weeks are incomplete. Only ~ 12124 0 2.39e6
## 5 2021 Data in recent weeks are incomplete. Only ~ 2572 0 1.06e6
## 6 2022 Data in recent weeks are incomplete. Only ~ 3310 0 8.36e5
## 7 2022 Data in recent weeks are incomplete. Only ~ 77 0 1.76e4
## 8 2022 Data in recent weeks are incomplete. Only ~ 1046 0 2.37e5
##
## *** File has been checked for uniqueness by: cluster year week
##
## Plots will be run after excluding stateNoCheck states
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w16.pdf
##
## Returning plot outputs to the main log file
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w16.pdf
##
## Returning plot outputs to the main log file
saveToRDS(cdcList_arch_2022w17, ovrWriteError=FALSE)
Comparisons can be run among deaths in each dataset:
makeRestatementData(c("cdcList_20220713", "cdcList_arch_2022w17", "cdcList_20220105"),
key="allUSAgg",
vecNames=c("2022-07-13", "2022-04-25", "2022-01-05")
) %>%
plotRestatementData(asRatio=TRUE)
makeRestatementData(c("cdcList_20220713", "cdcList_arch_2022w17", "cdcList_20220105"),
key="ageAgg",
vecNames=c("2022-07-13", "2022-04-25", "2022-01-05")
) %>%
plotRestatementData(asRatio=TRUE)
makeRestatementData(c("cdcList_20220713", "cdcList_arch_2022w17", "cdcList_20220105"),
key="clusterAgg",
vecNames=c("2022-07-13", "2022-04-25", "2022-01-05")
) %>%
plotRestatementData(asRatio=TRUE)
The persistent gap between reported deaths in the 2022-01-05 file and later reports reflects the exclusion of several cluster 5 states from the 2022-01-05 processing due to data suppression issues. There continues to be an anomaly where deaths among people under age 45 decreased between 2022-04-25 and 2022-07-13. This decrease is much smaller or nonexistent in data for ages 45+.
Data prior to exclusions are examined for consistency:
dfCheck <- readFromRDS("cdcList_arch_2022w17")$cdc %>%
select(state, weekEnding, age, deaths_220425=deaths) %>%
full_join(readFromRDS("cdcList_20220713")$cdc %>% select(state, weekEnding, age, deaths_220713=deaths),
by=c("state", "weekEnding", "age")
) %>%
mutate(delta=ifelse(is.na(deaths_220713), 0, deaths_220713)-ifelse(is.na(deaths_220425), 0, deaths_220425),
neg=(delta < 0)
)
dfCheck
## # A tibble: 108,190 × 7
## state weekEnding age deaths_220425 deaths_220713 delta neg
## <chr> <date> <fct> <dbl> <dbl> <dbl> <lgl>
## 1 AL 2015-01-10 Under 25 years 25 25 0 FALSE
## 2 AL 2015-01-10 25-44 years 67 67 0 FALSE
## 3 AL 2015-01-10 45-64 years 253 253 0 FALSE
## 4 AL 2015-01-10 65-74 years 202 202 0 FALSE
## 5 AL 2015-01-10 75-84 years 272 272 0 FALSE
## 6 AL 2015-01-10 85 years and older 320 320 0 FALSE
## 7 AL 2015-01-17 Under 25 years 28 28 0 FALSE
## 8 AL 2015-01-17 25-44 years 49 49 0 FALSE
## 9 AL 2015-01-17 45-64 years 256 256 0 FALSE
## 10 AL 2015-01-17 65-74 years 222 222 0 FALSE
## # … with 108,180 more rows
dfCheck %>% count(neg)
## # A tibble: 2 × 2
## neg n
## <lgl> <int>
## 1 FALSE 104486
## 2 TRUE 3704
# Get counts of changes by state
dfCheck %>%
group_by(state) %>%
summarize(nNeg=sum(neg), negDelta=sum(delta*neg), n=n(), .groups="drop") %>%
ggplot(aes(x=fct_reorder(state, negDelta), y=negDelta)) +
geom_col(fill="lightblue") +
geom_text(aes(label=negDelta), hjust=1) +
coord_flip() +
labs(y="Sum of negative changes in weekly deaths by age group from 2022-04-25 to 2022-07-13",
x=NULL,
title="Negative change in weekly death by state summary"
)
# Examples overall
dfCheck %>%
select(-delta, -neg) %>%
pivot_longer(starts_with("deaths")) %>%
group_by(weekEnding, age, name) %>%
summarize(deaths=specNA()(value), .groups="drop") %>%
ggplot(aes(x=weekEnding, y=deaths)) +
geom_line(aes(group=name, color=name)) +
lims(y=c(0, NA)) +
labs(x=NULL, y="Reported deaths", title="Reported deaths by age group and week in US") +
facet_wrap(~age, scales="free_y")
## Warning: Removed 8 rows containing missing values (`geom_line()`).
# Examples from Florida (biggest change)
dfCheck %>%
filter(state=="FL") %>%
select(-delta, -neg) %>%
pivot_longer(starts_with("deaths")) %>%
ggplot(aes(x=weekEnding, y=value)) +
geom_line(aes(group=name, color=name)) +
lims(y=c(0, NA)) +
labs(x=NULL, y="Reported deaths", title="Reported deaths by age group and week in Florida") +
facet_wrap(~age, scales="free_y")
## Warning: Removed 8 rows containing missing values (`geom_line()`).
Florida data show similarities to the national data, with negative restatements and negative recent trends primarily limited to the “Under 25 years” and “25-44 years” buckets.
Each state and age group is assessed for the total amount of negative delta relative to the average number of weekly deaths in the group, so the ratio is expressed in average weeks of deaths:
dfCheckAvg <- dfCheck %>%
group_by(state, age) %>%
summarize(across(starts_with("deaths"), specNA(mean)),
delta=specNA(sum)(ifelse(neg, delta, 0)),
.groups="drop"
) %>%
mutate(deltaRatio=delta/deaths_220425)
dfCheckAvg %>%
ggplot(aes(x=fct_reorder(state, deltaRatio, min), y=deltaRatio)) +
geom_col(fill="lightblue") +
geom_text(aes(y=deltaRatio/2, label=round(deltaRatio, 1))) +
coord_flip() +
facet_wrap(~age, nrow=1) +
labs(title="Total negative restatement", subtitle="Units are average number of weeks", y="Avg weeks", x=NULL)
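The ratio plotted above can be worked through by hand. The sketch below uses made-up values for a single state-age group and assumes that `specNA(mean)` and `specNA(sum)` behave like `mean` and `sum` with missing values removed (an assumption about the sourced helper, which is not shown in this file).

```r
# Hypothetical weekly records for one state-age group (illustrative values only)
grp <- data.frame(deaths_old = c(100, 110, 90, 100),
                  delta = c(0, -20, 5, -30))

# Sum only the negative restatements, mirroring ifelse(neg, delta, 0)
negTotal <- sum(ifelse(grp$delta < 0, grp$delta, 0), na.rm = TRUE)

# Average weekly deaths in the older vintage serves as the denominator
avgWeekly <- mean(grp$deaths_old, na.rm = TRUE)

# Ratio is in units of "average weeks of deaths" removed by restatement
deltaRatio <- negTotal / avgWeekly
deltaRatio
```

In this toy case the negative restatements total 50 deaths against an average of 100 deaths per week, so the group has lost half of an average week.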
bigDelta <- c("CO", "AZ", "SC", "FL", "OK", "VT")
dfCheck %>%
mutate(type=ifelse(state %in% bigDelta, "big delta", "all other")) %>%
group_by(weekEnding, type, age) %>%
summarize(across(starts_with("deaths"), specNA(sum)), .groups="drop") %>%
mutate(daynum=1L+7*as.integer(weekEnding-min(weekEnding))) %>%
mutate(pred=predict(lm(deaths_220425 ~ daynum*type*age, data=., subset=lubridate::year(weekEnding)<=2019),
newdata=.
)
) %>%
select(-daynum) %>%
pivot_longer(-c(weekEnding, type, age, pred)) %>%
ggplot(aes(x=weekEnding, y=value)) +
geom_line(aes(group=name, color=name)) +
geom_line(aes(y=pred), lty=2, lwd=0.5) +
lims(y=c(0, NA)) +
labs(title="Weekly deaths by state type",
subtitle="Big delta states: CO, AZ, SC, FL, OK, VT\nDashed line is simple linear model using 2015-2019 data",
x=NULL,
y=NULL
) +
facet_grid(type~age, scales="free_y")
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## Warning: Removed 8 rows containing missing values (`geom_line()`).
Much of the negative restatement is driven by a handful of states. Even so, there remains a general pattern of deaths, especially among younger groups, falling below historical trends in the most recent data.
Plots are created as ratios vs. expected (trend from 2015-2019):
dfCheck %>%
mutate(type=ifelse(state %in% bigDelta, "big delta", "all other")) %>%
group_by(weekEnding, type, age) %>%
summarize(across(starts_with("deaths"), specNA(sum)), .groups="drop") %>%
mutate(daynum=1L+7*as.integer(weekEnding-min(weekEnding))) %>%
mutate(pred=predict(lm(deaths_220425 ~ daynum*type*age, data=., subset=lubridate::year(weekEnding)<=2019),
newdata=.
)
) %>%
select(-daynum) %>%
pivot_longer(-c(weekEnding, type, age, pred)) %>%
ggplot(aes(x=weekEnding, y=value/pred)) +
geom_line(aes(group=name, color=name)) +
geom_line(aes(y=1), lty=2, lwd=0.5) +
lims(y=c(0, NA)) +
labs(title="Ratio of weekly deaths vs. 2015-2019 trend by state type",
subtitle="Big delta states: CO, AZ, SC, FL, OK, VT",
x=NULL,
y=NULL
) +
facet_grid(type~age, scales="free_y")
## Warning: Removed 8 rows containing missing values (`geom_line()`).
Reported deaths in recent weeks in the “Under 25 years” bucket are less than 50% of the level implied by a simple linear trend fit to 2015-2019 data. The “25-44 years” bucket is ~25% below trend, while the remaining buckets are near trend.
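The ratio-to-trend calculation can be sketched on synthetic data. The example below is a simplified, single-series version (no `type` or `age` interactions) with a made-up series that is exactly linear through 2019 and set to half of trend afterwards; all names and values are hypothetical.

```r
# Hypothetical weekly series, 2015-2021 (synthetic values, not CDC data)
weekEnding <- seq(as.Date("2015-01-10"), as.Date("2021-12-25"), by = "7 days")
toy <- data.frame(weekEnding = weekEnding,
                  daynum = as.integer(weekEnding - min(weekEnding)))
toy$isPost <- format(toy$weekEnding, "%Y") >= "2020"

# Exactly linear baseline, with post-2019 values set to half of trend
toy$deaths <- 100 + 0.01 * toy$daynum
toy$deaths[toy$isPost] <- 0.5 * toy$deaths[toy$isPost]

# Fit on the pre-2020 window only, then predict for every week
fit <- lm(deaths ~ daynum, data = toy, subset = !isPost)
toy$pred <- predict(fit, newdata = toy)
toy$ratio <- toy$deaths / toy$pred

# Mean ratio inside and outside the fitting window
tapply(toy$ratio, toy$isPost, mean)
```

By construction the ratio is about 1 inside the fitting window and about 0.5 afterwards. A straight-line trend ignores seasonality, so week-to-week ratios on real data would be expected to fluctuate around the trend even when nothing unusual is happening.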
The process is converted to functional form:
calculateRestatementFromRaw <- function(lst1Name,
lst2Name,
lstLabels,
labelBase=FALSE
) {
# FUNCTION ARGUMENTS
# lst1Name: character name (for readFromRDS) of first list that includes raw CDC data
# lst2Name: character name (for readFromRDS) of second list that includes raw CDC data
# lstLabels: labels to be used for list data (e.g., c("deaths_220425", "deaths_220713"))
# labelBase: boolean, should a convenience column "base" be created from lst1Name for later summarization?
# Create the data
df <- readFromRDS(lst1Name)$cdc %>%
select(state, weekEnding, age, deaths) %>%
colRenamer(c("deaths"="deaths1")) %>%
full_join(readFromRDS(lst2Name)$cdc %>%
select(state, weekEnding, age, deaths) %>%
colRenamer(c("deaths"="deaths2")),
by=c("state", "weekEnding", "age")
) %>%
mutate(delta=ifelse(is.na(deaths2), 0, deaths2)-ifelse(is.na(deaths1), 0, deaths1),
neg=(delta < 0)
)
# Add the base column if requested
if(isTRUE(labelBase)) df <- df %>% mutate(base=deaths1)
# Rename and return the data
df %>%
colRenamer(c("deaths1"=lstLabels[1], "deaths2"=lstLabels[2]))
}
identical(dfCheck,
calculateRestatementFromRaw("cdcList_arch_2022w17",
"cdcList_20220713",
lstLabels=c("deaths_220425", "deaths_220713")
)
)
## [1] TRUE
plotRestatementFromRaw <- function(df,
varNegTotal=c(),
fnDateStack=NULL,
timePeriod=NULL,
makeProp=FALSE,
makePropYears=NULL
) {
# FUNCTION ARGUMENTS:
# df: data frame from calculateRestatementFromRaw
# varNegTotal: variables that should be plotted for sum of negative restatement
# fnDateStack: function to apply to weekEnding for stacking data (NULL means none)
# timePeriod: character vector of time period of two data sources (NULL means infer from variable names)
# makeProp: boolean, should proportional deaths be shown?
# makePropYears: integer vector of years to include in proportional chart (NULL means latest year in data)
if(is.null(timePeriod)) {
timePeriod <- df %>%
select(starts_with("deaths_")) %>%
names() %>%
str_remove(pattern="deaths_") %>%
lubridate::ymd() %>%
as.character() %>%
paste0(collapse=" data to ")
timePeriod <- paste0("from ", timePeriod, " data")
}
# Get counts of changes by varNegTotal
for (keyVar in varNegTotal) {
# Set up data for stacking
if(!is.null(fnDateStack)) {
dfPlot <- df %>%
mutate(stackVar=fnDateStack(weekEnding))
keyVar <- c(keyVar, "stackVar")
} else {
dfPlot <- df
}
# Create the totals
dfTot <- dfPlot %>%
group_by_at(all_of(keyVar[keyVar != "stackVar"])) %>%
summarize(negDelta=sum(delta*neg), .groups="drop")
# Set up base plot and labels
p1 <- dfPlot %>%
group_by_at(all_of(keyVar)) %>%
summarize(nNeg=sum(neg), negDelta=sum(delta*neg), n=n(), .groups="drop") %>%
ggplot(aes(x=fct_reorder(get(keyVar[keyVar != "stackVar"]), negDelta), y=negDelta)) +
geom_text(data=dfTot, aes(label=negDelta), hjust=1) +
coord_flip() +
labs(y=paste0("Sum of negative changes in weekly deaths ", timePeriod),
x=NULL,
title=paste0("Negative change in weekly death by ", keyVar[keyVar != "stackVar"])
)
# Add the columns (either basic or stacked)
if(is.null(fnDateStack)) p1 <- p1 + geom_col(fill="lightblue")
else p1 <- p1 + geom_col(aes(fill=stackVar), position="stack")
# Print the plot
print(p1)
# Create proportional plot if requested
if(isTRUE(makeProp)) {
# Get the year if passed as NULL
if(is.null(makePropYears)) makePropYears <- max(lubridate::year(dfPlot$weekEnding))
# Create the plot
p2 <- dfPlot %>%
filter(lubridate::year(weekEnding) %in% makePropYears) %>%
mutate(deltaNeg=ifelse(neg, delta, 0)) %>%
group_by_at(all_of(keyVar[keyVar != "stackVar"])) %>%
summarize(across(where(is.numeric), sum, na.rm=TRUE)) %>%
mutate(pctNeg=deltaNeg/base) %>%
ggplot(aes(x=fct_reorder(get(keyVar[keyVar != "stackVar"]), pctNeg), y=pctNeg)) +
geom_col(fill="lightblue") +
geom_text(aes(y=pctNeg/2, label=paste0(round(100*pctNeg, 1), "%"))) +
coord_flip() +
labs(title=paste0("Proportion of ",
paste0(makePropYears, collapse="-"),
" deaths negatively restated ",
timePeriod
),
y=NULL,
x=NULL
)
# Print the plot
print(p2)
}
}
}
calculateRestatementFromRaw("cdcList_20220105",
"cdcList_arch_2022w17",
lstLabels=c("deaths_220105", "deaths_220425"),
labelBase=TRUE
) %>%
plotRestatementFromRaw(varNegTotal=c("state", "age"), makeProp=TRUE, makePropYears=2021)
calculateRestatementFromRaw("cdcList_arch_2022w17",
"cdcList_20220713",
lstLabels=c("deaths_220425", "deaths_220713"),
labelBase=TRUE
) %>%
plotRestatementFromRaw(varNegTotal=c("state", "age"), makeProp=TRUE, makePropYears=2022)
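The proportional chart divides the sum of negative changes by the sum of the base (older vintage) deaths within each group. A minimal base-R sketch of that arithmetic, using hypothetical values rather than CDC data:

```r
# Hypothetical restatement records (illustrative values only)
toy <- data.frame(age = c("25-44 years", "25-44 years", "45-64 years", "45-64 years"),
                  delta = c(-5, 2, -1, 3),
                  base = c(50, 50, 200, 200))

# Only negative changes count toward the restated total
toy$deltaNeg <- ifelse(toy$delta < 0, toy$delta, 0)

# Sum within each group, then express the negative total as a share of the base
agg <- aggregate(cbind(deltaNeg, base) ~ age, data = toy, FUN = sum)
agg$pctNeg <- agg$deltaNeg / agg$base
agg
```

Positive restatements are deliberately ignored here, so `pctNeg` measures gross downward revision rather than net change.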
# Create function for custom quarter-year
tmpCustomQuarter <- function(x)
ifelse(lubridate::year(x)==2022, paste0(lubridate::year(x), "-Q", lubridate::quarter(x)), lubridate::year(x))
calculateRestatementFromRaw("cdcList_arch_2022w17",
"cdcList_20220713",
lstLabels=c("deaths_220425", "deaths_220713")
) %>%
plotRestatementFromRaw(varNegTotal=c("state", "age"),
fnDateStack=tmpCustomQuarter
)
Between the 2022-01-05 data and the 2022-04-25 data, negative restatements totaled 559 (104 + 455) among people under the age of 45. Between the 2022-04-25 data and the 2022-07-13 data, negative restatements totaled 11,236 (3,820 + 7,416) in the same age range. The majority of the 2022-07-13 vs. 2022-04-25 restatements are in 2022-Q1 data, and proportionally the younger population is much more heavily restated than the older population.
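The stacking label used for the quarterly charts can also be sketched without lubridate. The helper name `labelYearQuarter` below is hypothetical; it relies only on base R's `format()` and `quarters()` and returns the year alone for dates outside the key years.

```r
# Base-R variant of a year / year-quarter label (hypothetical helper name)
labelYearQuarter <- function(x, keyYear = 2022) {
  yr <- as.integer(format(x, "%Y"))
  # quarters() returns "Q1" through "Q4" for Date input
  ifelse(yr %in% keyYear, paste0(yr, "-", quarters(x)), as.character(yr))
}

labelYearQuarter(as.Date(c("2021-11-06", "2022-02-12", "2022-05-07")))
```

Converting the year to character explicitly keeps both branches of `ifelse()` the same type, rather than relying on implicit coercion.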
Functions are run on data from previous years:
calculateRestatementFromRaw("cdcList_20210911",
"cdcList_20211203",
lstLabels=c("deaths_210911", "deaths_211203"),
labelBase=TRUE
) %>%
plotRestatementFromRaw(varNegTotal=c("state", "age"), makeProp=TRUE, makePropYears=2021)
# Create function for custom quarter-year
tmpCustomQuarter <- function(x, keyYear=2022)
ifelse(lubridate::year(x) %in% keyYear,
paste0(lubridate::year(x), "-Q", lubridate::quarter(x)),
lubridate::year(x)
)
calculateRestatementFromRaw("cdcList_20210911",
"cdcList_20211203",
lstLabels=c("deaths_210911", "deaths_211203")
) %>%
plotRestatementFromRaw(varNegTotal=c("state", "age"),
fnDateStack=function(x) tmpCustomQuarter(x, keyYear=2021)
)
Negative restatement of data was much less common, particularly among people under age 45, in a roughly 3-month comparison window from 2021 (2021-09-11 data vs. 2021-12-03 data).
Plots are made for the percentage of negative restatement by state and week for a specified age group:
# Create the basic frame
df_u45 <- calculateRestatementFromRaw("cdcList_arch_2022w17",
"cdcList_20220713",
lstLabels=c("deaths_220425", "deaths_220713"),
labelBase=TRUE
) %>%
filter(weekEnding >= as.Date("2021-12-01"),
!is.na(deaths_220425)
)
# Get the list of states in every week
u45States <- df_u45 %>% count(state) %>% filter(n>=max(n)-1) %>% pull(state)
u45States
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Create the plot (Under age 45)
df_u45 %>%
filter(state %in% u45States, age %in% c("Under 25 years", "25-44 years")) %>%
mutate(chg=delta/base) %>%
ggplot(aes(x=fct_reorder(state, chg, .fun=function(x) sum(ifelse(x<=0, x, 0))), y=weekEnding)) +
geom_tile(aes(fill=chg)) +
coord_flip() +
scale_fill_gradient2("Pct Restated", low="red", high="green", midpoint=0) +
labs(y="Week",
x=NULL,
title="Restatement of deaths in 2022-07-13 data vs. 2022-04-25 data",
subtitle="States missing at most 2 weeks for Under 25 and 25-44 years in each data set"
) +
facet_wrap(~age)
# Create the plot (All ages)
df_u45 %>%
filter(state %in% u45States) %>%
mutate(chg=delta/base) %>%
ggplot(aes(x=fct_reorder(state, chg, .fun=function(x) sum(ifelse(x<=0, x, 0))), y=weekEnding)) +
geom_tile(aes(fill=chg)) +
coord_flip() +
scale_fill_gradient2("Pct Restated", low="red", high="green", midpoint=0) +
labs(y="Week",
x=NULL,
title="Restatement of deaths in 2022-07-13 data vs. 2022-04-25 data",
subtitle="States missing at most 2 weeks for any age group"
) +
facet_wrap(~age)
There are significant differences in the amount of negative restatement by state and week in 2022, driven by the “Under 25 years” and “25-44 years” buckets. Restatements are generally modest to nonexistent in the December 2021 data and among people over age 45.
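The tile plots order states by the sum of their non-positive percentage changes. The sketch below reproduces that ordering statistic in base R on hypothetical values; `tapply()` stands in for the `fct_reorder()` call used in the plots.

```r
# Hypothetical state-week records (illustrative values only)
toy <- data.frame(state = c("AA", "AA", "BB", "BB", "CC", "CC"),
                  delta = c(-10, 5, -2, -1, 4, 0),
                  base = c(100, 100, 50, 50, 80, 80))

# Percentage restated relative to the older vintage
toy$chg <- toy$delta / toy$base

# Ordering statistic: total of the non-positive changes in each state
negSum <- tapply(toy$chg, toy$state, function(x) sum(ifelse(x <= 0, x, 0)))

# States sorted from most to least negative restatement
names(sort(negSum))
```

Because the statistic sums weekly percentages rather than weighting by deaths, a state with many small-count weeks could rank as heavily restated on the strength of a few large percentage swings.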
Specific states are explored:
allAges <- unique(readFromRDS("cdcList_arch_2022w17")$cdc$age)
allWeeks <- seq.Date(as.Date("2015-01-10"), as.Date("2029-12-31"), by="7 days")
allAgeWeek <- tibble::tibble(date=rep(allWeeks, times=length(allAges)), age=rep(allAges, each=length(allWeeks)))
allAgeWeek
## # A tibble: 4,692 × 2
## date age
## <date> <fct>
## 1 2015-01-10 Under 25 years
## 2 2015-01-17 Under 25 years
## 3 2015-01-24 Under 25 years
## 4 2015-01-31 Under 25 years
## 5 2015-02-07 Under 25 years
## 6 2015-02-14 Under 25 years
## 7 2015-02-21 Under 25 years
## 8 2015-02-28 Under 25 years
## 9 2015-03-07 Under 25 years
## 10 2015-03-14 Under 25 years
## # … with 4,682 more rows
# Example for Florida from pre-update data
readFromRDS("cdcList_arch_2022w17")$cdc %>%
filter(state=="FL") %>%
full_join(allAgeWeek %>% filter(date <= "2022-04-23"), by=c("weekEnding"="date", "age")) %>%
mutate(deaths=ifelse(is.na(deaths), 0, deaths)) %>%
ggplot(aes(x=weekEnding, y=deaths)) +
geom_line() +
lims(y=c(0, NA)) +
geom_smooth(data=~filter(., weekEnding <= "2020-01-01"), method="lm", fullrange=TRUE) +
geom_smooth(color="red") +
facet_wrap(~age, scales="free_y") +
labs(x=NULL,
y="Weekly all-cause deaths",
title="Weekly all-cause deaths in FL (2022-04-23 data)",
subtitle="Blue smooth is linear model based on 2015-2019 data"
)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
# Example for Colorado from post-update data
readFromRDS("cdcList_20220713")$cdc %>%
filter(state=="CO") %>%
full_join(allAgeWeek %>% filter(date <= "2022-06-18"), by=c("weekEnding"="date", "age")) %>%
mutate(deaths=ifelse(is.na(deaths), 0, deaths)) %>%
ggplot(aes(x=weekEnding, y=deaths)) +
geom_line() +
lims(y=c(0, NA)) +
geom_smooth(data=~filter(., weekEnding <= "2020-01-01"), method="lm", fullrange=TRUE) +
geom_smooth(color="red") +
facet_wrap(~age, scales="free_y") +
labs(x=NULL,
y="Weekly all-cause deaths",
title="Weekly all-cause deaths in CO (2022-06-18 data)",
subtitle="Blue smooth is linear model based on 2015-2019 data"
)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Slight downturns in the most recent weeks are normal, since deaths take several weeks to build in the CDC database (lag in death reporting from states and counties). There are also unexplained downturns in early-to-mid 2022 that were not observed in pre-update iterations of the data.
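The state plots above join each series to a complete week-by-age grid and fill absent records with zero. A minimal base-R sketch of that step, using a hypothetical sparse series in place of the CDC data:

```r
# Hypothetical sparse series: missing weeks are simply absent from the data
obs <- data.frame(weekEnding = as.Date(c("2022-01-01", "2022-01-15")),
                  age = c("Under 25 years", "Under 25 years"),
                  deaths = c(12, 15))

# Full grid of every week and age group that should be present
grid <- expand.grid(weekEnding = seq(as.Date("2022-01-01"), as.Date("2022-01-22"), by = "7 days"),
                    age = c("Under 25 years", "25-44 years"),
                    stringsAsFactors = FALSE)

# Outer join against the grid, then fill absent records with zero
full <- merge(obs, grid, by = c("weekEnding", "age"), all = TRUE)
full$deaths[is.na(full$deaths)] <- 0

nrow(full)        # one row per week and age group
sum(full$deaths)  # total is unchanged by the zero fill
```

Zero-filling makes gaps visible as drops to the axis, but an absent record may reflect suppression rather than a true zero, so the filled series is better read as a lower bound for small-count groups.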
The process is converted to functional form:
plotCDCRestatement <- function(lstOld,
lstNew,
lstLabels,
minDate=NULL,
maxDate=NULL,
useStates=NULL,
maxMissingAllowed=2,
lstFilter=list(),
lstExclude=list(),
createStatePlot=TRUE,
restateYears=c(),
restateQuarterYears=c(),
returnData=!isTRUE(createStatePlot)
) {
# FUNCTION ARGUMENTS:
# lstOld: name for the old CDC data list (character name of the list for readFromRDS)
# lstNew: name for the new CDC data list (character name of the list for readFromRDS)
# lstLabels: character vector of length 2 giving the plotting names for the lists
# minDate: include only data on or after this date (NULL means include all)
# maxDate: include only data on or before this date (NULL means include all)
# useStates: states to be included in plotting (NULL means create from data and maxMissingAllowed)
# maxMissingAllowed: maximum number of missing records allowed to include state
# lstFilter: a list for filtering records, of form list("field"=c("allowed values"))
# lstExclude: a list for filtering records, of form list("field"=c("disallowed values"))
# createStatePlot: should the restatement plot be created?
# restateYears: years for which restatement plots should be created
# restateQuarterYears: years for which quarter should be broken out
# returnData: boolean, should the plotting data frame be returned?
# Create the basic frame
df <- calculateRestatementFromRaw(lstOld, lstNew, lstLabels=lstLabels, labelBase=TRUE)
if(!is.null(minDate)) df <- df %>% filter(weekEnding >= as.Date(minDate))
if(!is.null(maxDate)) df <- df %>% filter(weekEnding <= as.Date(maxDate))
# Get the list of states in every week
if(is.null(useStates)) {
useStates <- df %>%
filter(!is.na(get(lstLabels[1]))) %>%
count(state) %>%
filter(n>=max(n)-maxMissingAllowed) %>%
pull(state)
}
cat("\n", length(useStates), "states will be included:", paste0(useStates, collapse=", "), "\n")
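# NOTE: the states are printed under the list name "age"; the filter applied below correctly uses "state"
print(c(lstFilter, list("age"=useStates)))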
# Create the plot
if(isTRUE(createStatePlot)) {
p1 <- df %>%
filter(!is.na(get(lstLabels[1]))) %>%
rowFilter(lstFilter=c(lstFilter, list("state"=useStates)), lstExclude=lstExclude) %>%
mutate(chg=delta/base) %>%
ggplot(aes(x=fct_reorder(state, chg, .fun=function(x) sum(ifelse(x<=0, x, 0))), y=weekEnding)) +
geom_tile(aes(fill=chg)) +
coord_flip() +
scale_fill_gradient2("Pct Restated", low="red", high="green", midpoint=0) +
labs(y="Week",
x=NULL,
title=paste0("Restatement of deaths in ",
lubridate::ymd(stringr::str_extract(lstLabels[1], pattern="\\d{6}")),
" data vs. ",
lubridate::ymd(stringr::str_extract(lstLabels[2], pattern="\\d{6}")),
" data"
),
subtitle="Select states meeting minimum data availability threshold"
) +
facet_wrap(~age)
print(p1)
}
# Create restatement by year plots
for(curYear in restateYears) {
calculateRestatementFromRaw(lstOld, lstNew, lstLabels=lstLabels, labelBase=TRUE) %>%
plotRestatementFromRaw(varNegTotal=c("state", "age"), makeProp=TRUE, makePropYears=curYear)
}
# Create restatement by quarter plots
if(length(restateQuarterYears) > 0) {
calculateRestatementFromRaw(lstOld, lstNew, lstLabels=lstLabels) %>%
plotRestatementFromRaw(varNegTotal=c("state", "age"),
fnDateStack=function(x) tmpCustomQuarter(x, keyYear=restateQuarterYears)
)
}
# Return the data if requested
if(isTRUE(returnData)) return(df)
}
# Check that data are the same
all.equal(dfCheck,
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
lstNew="cdcList_20220713",
lstLabels=c("deaths_220425", "deaths_220713"),
createStatePlot=FALSE
) %>%
select(-base)
)
##
## 17 states will be included: AL, AZ, CA, CO, FL, GA, IL, MI, NC, NY, OH, PA, SC, TN, TX, VA, WA
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MI" "NC" "NY" "OH" "PA" "SC" "TN" "TX"
## [16] "VA" "WA"
## [1] TRUE
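The state-inclusion rule inside `plotCDCRestatement()` keeps states whose record count is within `maxMissingAllowed` of the best-covered state. The sketch below isolates that rule in base R; `keepStates` and the toy counts are hypothetical.

```r
# Hypothetical record counts per state (illustrative values only)
recs <- data.frame(state = rep(c("AA", "BB", "CC"), times = c(10, 9, 6)))

# Keep states whose record count is within maxMissingAllowed of the maximum
keepStates <- function(df, maxMissingAllowed = 2) {
  n <- table(df$state)
  names(n)[n >= max(n) - maxMissingAllowed]
}

keepStates(recs)                         # CC is 4 records short of the maximum
keepStates(recs, maxMissingAllowed = 4)  # a looser threshold admits CC
```

Since the counts depend on the date window, applying `minDate` or `maxDate` before counting can change which states qualify, which is presumably why the number of included states differs between the full-history and post-2021-12-01 runs.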
# Plot for all ages
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
lstNew="cdcList_20220713",
lstLabels=c("deaths_220425", "deaths_220713"),
minDate="2021-12-01"
)
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Plot for under 45
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
lstNew="cdcList_20220713",
lstLabels=c("deaths_220425", "deaths_220713"),
lstFilter=list("age"=c("Under 25 years", "25-44 years")),
minDate="2021-12-01"
)
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "Under 25 years" "25-44 years"
##
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Yearly and quarterly plots
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
lstNew="cdcList_20220713",
lstLabels=c("deaths_220425", "deaths_220713"),
createStatePlot=FALSE,
restateYears=2022,
restateQuarterYears=2021:2022
)
##
## 17 states will be included: AL, AZ, CA, CO, FL, GA, IL, MI, NC, NY, OH, PA, SC, TN, TX, VA, WA
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MI" "NC" "NY" "OH" "PA" "SC" "TN" "TX"
## [16] "VA" "WA"
## # A tibble: 108,190 × 8
## state weekEnding age deaths_220425 deaths_…¹ delta neg base
## <chr> <date> <fct> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 AL 2015-01-10 Under 25 years 25 25 0 FALSE 25
## 2 AL 2015-01-10 25-44 years 67 67 0 FALSE 67
## 3 AL 2015-01-10 45-64 years 253 253 0 FALSE 253
## 4 AL 2015-01-10 65-74 years 202 202 0 FALSE 202
## 5 AL 2015-01-10 75-84 years 272 272 0 FALSE 272
## 6 AL 2015-01-10 85 years and older 320 320 0 FALSE 320
## 7 AL 2015-01-17 Under 25 years 28 28 0 FALSE 28
## 8 AL 2015-01-17 25-44 years 49 49 0 FALSE 49
## 9 AL 2015-01-17 45-64 years 256 256 0 FALSE 256
## 10 AL 2015-01-17 65-74 years 222 222 0 FALSE 222
## # … with 108,180 more rows, and abbreviated variable name ¹deaths_220713
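The `lstFilter` and `lstExclude` arguments describe allowed and disallowed values per field. The sourced `rowFilter()` is not shown in this file, so the sketch below is a hypothetical stand-in (`filterByList`) that implements the same idea in base R, not the actual helper.

```r
# Hypothetical stand-in for a list-driven row filter (not the sourced rowFilter)
filterByList <- function(df, lstFilter = list(), lstExclude = list()) {
  keep <- rep(TRUE, nrow(df))
  # Keep rows whose field value is in the allowed set
  for (v in names(lstFilter)) keep <- keep & df[[v]] %in% lstFilter[[v]]
  # Drop rows whose field value is in the disallowed set
  for (v in names(lstExclude)) keep <- keep & !(df[[v]] %in% lstExclude[[v]])
  df[keep, , drop = FALSE]
}

toy <- data.frame(state = c("AA", "AA", "BB", "CC"),
                  age = c("Under 25 years", "45-64 years", "25-44 years", "Under 25 years"))

filterByList(toy,
             lstFilter = list(age = c("Under 25 years", "25-44 years")),
             lstExclude = list(state = "CC"))
```

Passing filters as named lists lets a caller combine a user-supplied filter with the state list by simple concatenation, as in `c(lstFilter, list("state"=useStates))`.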
The process is updated with the latest data (downloaded 2022-09-25):
# STEP 1: Latest CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20220925.csv"
cdcList_20220925 <- readRunCDCAllCause(loc=cdcLoc,
weekThru=24,
lst=readFromRDS("cdc_daily_220902"),
stateNoCheck=c(),
pdfCluster=TRUE,
pdfAge=TRUE
)
##
## Parameter cvDeathThru has been set as: 2022-06-18
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## [1] state weekEnding year week age Suppress deaths
## <0 rows> (or 0-length row.names)
##
##
## Problems by state:
## # A tibble: 0 × 5
## # … with 5 variables: noCheck <lgl>, state <chr>, problem <lgl>, n <int>,
## # deaths <dbl>
## # ℹ Use `colnames()` to see all variable names
## Warning in max(.): no non-missing arguments to max; returning -Inf
##
##
## There are 0 rows with errors; maximum for any given state is -Inf errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 108,335
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala…
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10…
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",…
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,…
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,…
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8…
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015…
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted …
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,…
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 × 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 12739 0 442479
## 2 25-44 years 16359 0 1135808
## 3 45-64 years 19815 0 4320034
## 4 65-74 years 19807 0 4366909
## 5 75-84 years 19813 0 5343810
## 6 85 years and older 19802 0 6741337
##
##
## Checking variable combination: period year Type
## # A tibble: 8 × 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432822
## 7 2021 2021 Predicted (weighted) 14700 0 3450828
## 8 2022 2022 Predicted (weighted) 6768 0 1566568
##
##
## Checking variable combination: period Suppress
## # A tibble: 4 × 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432822
## 3 2021 <NA> 14700 0 3450828
## 4 2022 <NA> 6768 0 1566568
##
##
## Checking variable combination: period Note
## # A tibble: 11 × 5
## period Note n n_dea…¹ deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 1.39e7
## 2 2020 Data in recent weeks are incomplete. Only 60%… 279 0 8.69e4
## 3 2020 <NA> 14555 0 3.35e6
## 4 2021 Data in recent weeks are incomplete. Only 60%… 14138 0 3.20e6
## 5 2021 Data in recent weeks are incomplete. Only 60%… 11 0 3.15e3
## 6 2021 Data in recent weeks are incomplete. Only 60%… 10 0 2.66e2
## 7 2021 Data in recent weeks are incomplete. Only 60%… 541 0 2.52e5
## 8 2022 Data in recent weeks are incomplete. Only 60%… 6027 0 1.33e6
## 9 2022 Data in recent weeks are incomplete. Only 60%… 274 0 6.90e4
## 10 2022 Data in recent weeks are incomplete. Only 60%… 9 0 2.73e2
## 11 2022 Data in recent weeks are incomplete. Only 60%… 458 0 1.72e5
## # … with abbreviated variable name ¹n_deaths_na
##
## *** File has been checked for uniqueness by: cluster year week
##
## Plots will be run after excluding stateNoCheck states
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w24.pdf
##
## Returning plot outputs to the main log file
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w24.pdf
##
## Returning plot outputs to the main log file
saveToRDS(cdcList_20220925, ovrWriteError=FALSE)
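The log above reports that the file was checked for uniqueness by state, year, week and age. How the sourced functions perform that check is not shown here; a minimal base-R sketch of one way to do it, with a hypothetical helper `isUniqueBy` and toy records, is:

```r
# Hypothetical records with one duplicated key (illustrative values only)
toy <- data.frame(state = c("AA", "AA", "AA", "BB"),
                  year = c(2022, 2022, 2022, 2022),
                  week = c(1, 2, 2, 1),
                  age = rep("25-44 years", 4))

# TRUE only when no combination of the key columns repeats
isUniqueBy <- function(df, keys) anyDuplicated(df[, keys, drop = FALSE]) == 0

keys <- c("state", "year", "week", "age")
isUniqueBy(toy, keys)          # the AA week-2 record appears twice
isUniqueBy(unique(toy), keys)  # passes once exact duplicates are dropped
```

Running a check like this before joining two vintages guards against a full join silently multiplying rows when a key repeats.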
# STEP 2: Latest death by location-cause data
allCause_220925 <- analyzeAllCause(loc="COvID_deaths_age_place_20220925.csv",
cdcDailyList=readFromRDS("cdc_daily_220902"),
compareThruDate="2022-08-25"
)
## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.
##
## States without abbreviations
## # A tibble: 2 × 10
## # Groups: State [2]
## State abb Year Month covid…¹ total…² pneum…³ pneum…⁴ fluDe…⁵ pnemo…⁶
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 New York Ci… <NA> 0 0 36032 188073 23909 13276 1076 46890
## 2 Puerto Rico <NA> 0 0 5092 89479 12556 3578 198 14241
## # … with abbreviated variable names ¹covidDeaths, ²totalDeaths, ³pneumoDeaths,
## # ⁴pneumoCovidDeaths, ⁵fluDeaths, ⁶pnemoFluCovidDeaths
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 1,992 × 12
## asofDate startDate endDate Group State death…¹ Age name dfSub dfTot
## <date> <date> <date> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 2022-09-21 2022-01-01 2022-09-17 By Ye… Unit… Decede… All … fluD… 103 308
## 2 2022-09-21 2020-10-01 2020-10-31 By Mo… Unit… Total … 30-3… pnem… 205 401
## 3 2022-09-21 2021-10-01 2021-10-31 By Mo… Unit… Decede… 40-4… pnem… 150 346
## 4 2022-09-21 2020-02-01 2020-02-29 By Mo… Unit… Total … 30-3… pnem… 71 261
## 5 2022-09-21 2021-11-01 2021-11-30 By Mo… Unit… Health… 75-8… pnem… 139 329
## 6 2022-09-21 2022-04-01 2022-04-30 By Mo… Unit… Total … All … fluD… 215 404
## 7 2022-09-21 2020-11-01 2020-11-30 By Mo… Unit… Total … 30-3… pneu… 227 413
## 8 2022-09-21 2021-08-01 2021-08-31 By Mo… Unit… Other All … pneu… 627 812
## 9 2022-09-21 2020-08-01 2020-08-31 By Mo… Unit… Other 0-17… tota… 116 297
## 10 2022-09-21 2020-09-01 2020-09-30 By Mo… Unit… Decede… 50-6… pnem… 190 370
## # … with 1,982 more rows, 2 more variables: delta <dbl>, pct <dbl>, and
## # abbreviated variable name ¹deathPlace
## # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # ℹ Use `colnames()` to see all variable names
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # ℹ Use `colnames()` to see all variable names
## # A tibble: 0 × 4
## # … with 4 variables: abb <chr>, cumValue <dbl>, tot_deaths <dbl>,
## # pctdiff <dbl>
## # ℹ Use `colnames()` to see all variable names
## # A tibble: 1 × 3
## cumValue tot_deaths pctdiff
## <dbl> <dbl> <dbl>
## 1 0 0 0
saveToRDS(allCause_220925, ovrWriteError=FALSE)
# STEP 3: Facets for excess all-cause deaths
excessDeathFacets(lstCDC=cdcList_20220925, lstAll=allCause_220925, dateThru="2022-08-31", plotYLim=c(-200, 1200))
Restatement is also assessed:
# Plot for all ages
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
lstNew="cdcList_20220925",
lstLabels=c("deaths_220425", "deaths_220925"),
minDate="2021-12-01"
)
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Plot for under 45
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
lstNew="cdcList_20220925",
lstLabels=c("deaths_220425", "deaths_220925"),
lstFilter=list("age"=c("Under 25 years", "25-44 years")),
minDate="2021-12-01"
)
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "Under 25 years" "25-44 years"
##
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Yearly and quarterly plots
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
lstNew="cdcList_20220925",
lstLabels=c("deaths_220425", "deaths_220925"),
createStatePlot=FALSE,
restateYears=2022,
restateQuarterYears=2021:2022
)
##
## 17 states will be included: AL, AZ, CA, CO, FL, GA, IL, MI, NC, NY, OH, PA, SC, TN, TX, VA, WA
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MI" "NC" "NY" "OH" "PA" "SC" "TN" "TX"
## [16] "VA" "WA"
## # A tibble: 108,342 × 8
## state weekEnding age deaths_220425 deaths_…¹ delta neg base
## <chr> <date> <fct> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 AL 2015-01-10 Under 25 years 25 25 0 FALSE 25
## 2 AL 2015-01-10 25-44 years 67 67 0 FALSE 67
## 3 AL 2015-01-10 45-64 years 253 253 0 FALSE 253
## 4 AL 2015-01-10 65-74 years 202 202 0 FALSE 202
## 5 AL 2015-01-10 75-84 years 272 272 0 FALSE 272
## 6 AL 2015-01-10 85 years and older 320 320 0 FALSE 320
## 7 AL 2015-01-17 Under 25 years 28 28 0 FALSE 28
## 8 AL 2015-01-17 25-44 years 49 49 0 FALSE 49
## 9 AL 2015-01-17 45-64 years 256 256 0 FALSE 256
## 10 AL 2015-01-17 65-74 years 222 222 0 FALSE 222
## # … with 108,332 more rows, and abbreviated variable name ¹deaths_220925
## # ℹ Use `print(n = ...)` to see more rows
There are still some restatement issues relative to older data, though the very large percentage declines previously observed in persons under age 45 are no longer evident.
The process is updated with the latest data (downloaded 2022-10-21):
# STEP 1: Latest CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20221021.csv"
cdcList_20221021 <- readRunCDCAllCause(loc=cdcLoc,
weekThru=28,
lst=readFromRDS("cdc_daily_221002"),
stateNoCheck=c(),
pdfCluster=TRUE,
pdfAge=TRUE
)
##
## Parameter cvDeathThru has been set as: 2022-07-16
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## [1] state weekEnding year week age Suppress deaths
## <0 rows> (or 0-length row.names)
##
##
## Problems by state:
## # A tibble: 0 × 5
## # … with 5 variables: noCheck <lgl>, state <chr>, problem <lgl>, n <int>,
## # deaths <dbl>
## # ℹ Use `colnames()` to see all variable names
## Warning in max(.): no non-missing arguments to max; returning -Inf
##
##
## There are 0 rows with errors; maximum for any given state is -Inf errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 109,471
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala…
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10…
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",…
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,…
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,…
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8…
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015…
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted …
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,…
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 × 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 12878 0 447623
## 2 25-44 years 16540 0 1149977
## 3 45-64 years 20019 0 4363550
## 4 65-74 years 20011 0 4413911
## 5 75-84 years 20017 0 5401704
## 6 85 years and older 20006 0 6806568
##
##
## Checking variable combination: period year Type
## # A tibble: 8 × 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432822
## 7 2021 2021 Predicted (weighted) 14703 0 3450867
## 8 2022 2022 Predicted (weighted) 7901 0 1799485
##
##
## Checking variable combination: period Suppress
## # A tibble: 4 × 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432822
## 3 2021 <NA> 14703 0 3450867
## 4 2022 <NA> 7901 0 1799485
##
##
## Checking variable combination: period Note
## # A tibble: 12 × 5
## period Note n n_dea…¹ deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 1.39e7
## 2 2020 Data in recent weeks are incomplete. Only 60%… 279 0 8.69e4
## 3 2020 <NA> 14555 0 3.35e6
## 4 2021 Data in recent weeks are incomplete. Only 60%… 14133 0 3.19e6
## 5 2021 Data in recent weeks are incomplete. Only 60%… 11 0 3.15e3
## 6 2021 Data in recent weeks are incomplete. Only 60%… 19 0 5.02e2
## 7 2021 Data in recent weeks are incomplete. Only 60%… 540 0 2.53e5
## 8 2022 Data in recent weeks are incomplete. Only 60%… 7067 0 1.52e6
## 9 2022 Data in recent weeks are incomplete. Only 60%… 292 0 7.17e4
## 10 2022 Data in recent weeks are incomplete. Only 60%… 4 0 1.23e2
## 11 2022 Data in recent weeks are incomplete. Only 60%… 509 0 1.97e5
## 12 2022 Data in recent weeks are incomplete. Only 60%… 29 0 7.84e3
## # … with abbreviated variable name ¹n_deaths_na
##
## *** File has been checked for uniqueness by: cluster year week
##
## Plots will be run after excluding stateNoCheck states
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w28.pdf
##
## Returning plot outputs to the main log file
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w28.pdf
##
## Returning plot outputs to the main log file
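The log above reports a uniqueness check by key columns followed by control totals (`n`, `n_deaths_na`, `deaths`) for several variable combinations. A minimal base R sketch of both ideas follows; `checkUnique()` is a hypothetical helper written for this illustration, the data are invented, and nothing here is taken from the sourced functions.

```r
# Toy processed data (values are illustrative, not CDC figures)
df <- data.frame(state = c("AL", "AL", "AL", "AZ"),
                 year = c(2022, 2022, 2022, 2022),
                 week = c(1, 1, 2, 1),
                 age = c("25-44 years", "45-64 years", "25-44 years", "25-44 years"),
                 deaths = c(67, 253, NA, 40))

# Hypothetical helper: stop if any key combination appears more than once
checkUnique <- function(df, keys) {
  dups <- duplicated(df[, keys, drop = FALSE])
  if (any(dups)) stop("Data are not unique by: ", paste(keys, collapse = " "))
  cat("*** File has been checked for uniqueness by:", paste(keys, collapse = " "), "\n")
  invisible(TRUE)
}
checkUnique(df, c("state", "year", "week", "age"))

# Control totals by age: row count, count of NA deaths, and sum of deaths
ctl <- do.call(rbind, lapply(split(df, df$age), function(d) {
  data.frame(age = d$age[1],
             n = nrow(d),
             n_deaths_na = sum(is.na(d$deaths)),
             deaths = sum(d$deaths, na.rm = TRUE))
}))
print(ctl)
```

Comparing these totals from one download to the next is a quick way to spot dropped rows or unexpected restatement of closed years.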
saveToRDS(cdcList_20221021, ovrWriteError=FALSE)
# STEP 2: Latest death by location-cause data
allCause_221021 <- analyzeAllCause(loc="COvID_deaths_age_place_20221021.csv",
                                   cdcDailyList=readFromRDS("cdc_daily_221002"),
                                   compareThruDate="2022-09-22"
                                   )
## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.
##
## States without abbreviations
## # A tibble: 2 × 10
## # Groups: State [2]
## State abb Year Month covid…¹ total…² pneum…³ pneum…⁴ fluDe…⁵ pnemo…⁶
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 New York Ci… <NA> 0 0 36217 192384 24214 13317 1079 47342
## 2 Puerto Rico <NA> 0 0 5246 92229 12935 3683 204 14673
## # … with abbreviated variable names ¹covidDeaths, ²totalDeaths, ³pneumoDeaths,
## # ⁴pneumoCovidDeaths, ⁵fluDeaths, ⁶pnemoFluCovidDeaths
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 2,041 × 12
## asofDate startDate endDate Group State death…¹ Age name dfSub dfTot
## <date> <date> <date> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 2022-10-19 2022-01-01 2022-10-15 By Ye… Unit… Decede… All … fluD… 118 321
## 2 2022-10-19 2020-10-01 2020-10-31 By Mo… Unit… Total … 30-3… pnem… 205 401
## 3 2022-10-19 2021-10-01 2021-10-31 By Mo… Unit… Decede… 40-4… pnem… 150 346
## 4 2022-10-19 2020-02-01 2020-02-29 By Mo… Unit… Total … 30-3… pnem… 71 261
## 5 2022-10-19 2021-11-01 2021-11-30 By Mo… Unit… Health… 75-8… pnem… 139 329
## 6 2022-10-19 2022-04-01 2022-04-30 By Mo… Unit… Total … All … fluD… 215 405
## 7 2022-10-19 2020-11-01 2020-11-30 By Mo… Unit… Total … 30-3… pneu… 227 413
## 8 2022-10-19 2021-08-01 2021-08-31 By Mo… Unit… Other All … pneu… 627 812
## 9 2022-10-19 2020-08-01 2020-08-31 By Mo… Unit… Other 0-17… tota… 116 297
## 10 2022-10-19 2022-01-01 2022-10-15 By Ye… Unit… Health… 0-17… pneu… 108 289
## # … with 2,031 more rows, 2 more variables: delta <dbl>, pct <dbl>, and
## # abbreviated variable name ¹deathPlace
## # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # ℹ Use `colnames()` to see all variable names
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # ℹ Use `colnames()` to see all variable names
## # A tibble: 0 × 4
## # … with 4 variables: abb <chr>, cumValue <dbl>, tot_deaths <dbl>,
## # pctdiff <dbl>
## # ℹ Use `colnames()` to see all variable names
## # A tibble: 1 × 3
## cumValue tot_deaths pctdiff
## <dbl> <dbl> <dbl>
## 1 0 0 0
saveToRDS(allCause_221021, ovrWriteError=FALSE)
# STEP 3: Facets for excess all-cause deaths
excessDeathFacets(lstCDC=cdcList_20221021, lstAll=allCause_221021, dateThru="2022-09-30", plotYLim=c(-200, 1200))
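The suppression check in STEP 1 logs `Warning in max(.): no non-missing arguments to max; returning -Inf` because the problems table has zero rows, which is also why the summary line reads "maximum for any given state is -Inf errors". One way to avoid this, sketched below with a hypothetical `safeMax()` helper, is to return a default when there is nothing to summarise. This is an illustration only and is not how the sourced function is written.

```r
# Zero-row problems table, shaped loosely like the one printed in the log
problems <- data.frame(state = character(0), n = integer(0))

# max() on an empty vector warns and returns -Inf
unguarded <- suppressWarnings(max(problems$n))

# Hypothetical helper: return a default when no non-missing values exist
safeMax <- function(x, default = 0) {
  x <- x[!is.na(x)]
  if (length(x) == 0) default else max(x)
}

cat("There are", sum(problems$n),
    "rows with errors; maximum for any given state is",
    safeMax(problems$n), "errors\n")
```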
Restatement relative to the archived 2022 week 17 data (labelled deaths_220425) is also assessed:
# Plot for all ages
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221021",
                   lstLabels=c("deaths_220425", "deaths_221021"),
                   minDate="2021-12-01"
                   )
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Plot for under 45
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221021",
                   lstLabels=c("deaths_220425", "deaths_221021"),
                   lstFilter=list("age"=c("Under 25 years", "25-44 years")),
                   minDate="2021-12-01"
                   )
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "Under 25 years" "25-44 years"
##
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Yearly and quarterly plots
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221021",
                   lstLabels=c("deaths_220425", "deaths_221021"),
                   createStatePlot=FALSE,
                   restateYears=2022,
                   restateQuarterYears=2021:2022
                   )
##
## 17 states will be included: AL, AZ, CA, CO, FL, GA, IL, MI, NC, NY, OH, PA, SC, TN, TX, VA, WA
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MI" "NC" "NY" "OH" "PA" "SC" "TN" "TX"
## [16] "VA" "WA"
## # A tibble: 109,478 × 8
## state weekEnding age deaths_220425 deaths_…¹ delta neg base
## <chr> <date> <fct> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 AL 2015-01-10 Under 25 years 25 25 0 FALSE 25
## 2 AL 2015-01-10 25-44 years 67 67 0 FALSE 67
## 3 AL 2015-01-10 45-64 years 253 253 0 FALSE 253
## 4 AL 2015-01-10 65-74 years 202 202 0 FALSE 202
## 5 AL 2015-01-10 75-84 years 272 272 0 FALSE 272
## 6 AL 2015-01-10 85 years and older 320 320 0 FALSE 320
## 7 AL 2015-01-17 Under 25 years 28 28 0 FALSE 28
## 8 AL 2015-01-17 25-44 years 49 49 0 FALSE 49
## 9 AL 2015-01-17 45-64 years 256 256 0 FALSE 256
## 10 AL 2015-01-17 65-74 years 222 222 0 FALSE 222
## # … with 109,468 more rows, and abbreviated variable name ¹deaths_221021
## # ℹ Use `print(n = ...)` to see more rows
The process is updated with the data downloaded on 2022-11-12:
# STEP 1: Latest CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20221112.csv"
cdcList_20221112 <- readRunCDCAllCause(loc=cdcLoc,
                                       weekThru=31,
                                       lst=readFromRDS("cdc_daily_221102"),
                                       stateNoCheck=c(),
                                       pdfCluster=TRUE,
                                       pdfAge=TRUE
                                       )
##
## Parameter cvDeathThru has been set as: 2022-08-06
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## [1] state weekEnding year week age Suppress deaths
## <0 rows> (or 0-length row.names)
##
##
## Problems by state:
## # A tibble: 0 × 5
## # … with 5 variables: noCheck <lgl>, state <chr>, problem <lgl>, n <int>,
## # deaths <dbl>
## # ℹ Use `colnames()` to see all variable names
## Warning in max(.): no non-missing arguments to max; returning -Inf
##
##
## There are 0 rows with errors; maximum for any given state is -Inf errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 110,321
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala…
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10…
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",…
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,…
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,…
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8…
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015…
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted …
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,…
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 × 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 12981 0 451371
## 2 25-44 years 16675 0 1160681
## 3 45-64 years 20172 0 4396452
## 4 65-74 years 20164 0 4449803
## 5 75-84 years 20170 0 5445933
## 6 85 years and older 20159 0 6856886
##
##
## Checking variable combination: period year Type
## # A tibble: 8 × 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432820
## 7 2021 2021 Predicted (weighted) 14702 0 3450868
## 8 2022 2022 Predicted (weighted) 8752 0 1977279
##
##
## Checking variable combination: period Suppress
## # A tibble: 4 × 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432820
## 3 2021 <NA> 14702 0 3450868
## 4 2022 <NA> 8752 0 1977279
##
##
## Checking variable combination: period Note
## # A tibble: 12 × 5
## period Note n n_dea…¹ deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 1.39e7
## 2 2020 Data in recent weeks are incomplete. Only 60%… 279 0 8.69e4
## 3 2020 <NA> 14555 0 3.35e6
## 4 2021 Data in recent weeks are incomplete. Only 60%… 14131 0 3.19e6
## 5 2021 Data in recent weeks are incomplete. Only 60%… 11 0 3.15e3
## 6 2021 Data in recent weeks are incomplete. Only 60%… 9 0 2.8 e2
## 7 2021 Data in recent weeks are incomplete. Only 60%… 551 0 2.56e5
## 8 2022 Data in recent weeks are incomplete. Only 60%… 7823 0 1.67e6
## 9 2022 Data in recent weeks are incomplete. Only 60%… 338 0 8.30e4
## 10 2022 Data in recent weeks are incomplete. Only 60%… 14 0 4.14e2
## 11 2022 Data in recent weeks are incomplete. Only 60%… 559 0 2.24e5
## 12 2022 Data in recent weeks are incomplete. Only 60%… 18 0 4.40e3
## # … with abbreviated variable name ¹n_deaths_na
##
## *** File has been checked for uniqueness by: cluster year week
##
## Plots will be run after excluding stateNoCheck states
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w31.pdf
##
## Returning plot outputs to the main log file
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w31.pdf
##
## Returning plot outputs to the main log file
saveToRDS(cdcList_20221112, ovrWriteError=FALSE)
# STEP 2: Latest death by location-cause data
allCause_221112 <- analyzeAllCause(loc="COvID_deaths_age_place_20221112.csv",
                                   cdcDailyList=readFromRDS("cdc_daily_221102"),
                                   compareThruDate="2022-10-13"
                                   )
## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.
##
## States without abbreviations
## # A tibble: 2 × 10
## # Groups: State [2]
## State abb Year Month covid…¹ total…² pneum…³ pneum…⁴ fluDe…⁵ pnemo…⁶
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 New York Ci… <NA> 0 0 36392 195585 24452 13387 1080 47686
## 2 Puerto Rico <NA> 0 0 5310 94135 13171 3714 207 14944
## # … with abbreviated variable names ¹covidDeaths, ²totalDeaths, ³pneumoDeaths,
## # ⁴pneumoCovidDeaths, ⁵fluDeaths, ⁶pnemoFluCovidDeaths
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 2,063 × 12
## asofDate startDate endDate Group State death…¹ Age name dfSub dfTot
## <date> <date> <date> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 2022-11-09 2022-01-01 2022-11-05 By Ye… Unit… Decede… All … fluD… 136 338
## 2 2022-11-09 2020-10-01 2020-10-31 By Mo… Unit… Total … 30-3… pnem… 205 401
## 3 2022-11-09 2021-10-01 2021-10-31 By Mo… Unit… Decede… 40-4… pnem… 150 346
## 4 2022-11-09 2020-02-01 2020-02-29 By Mo… Unit… Total … 30-3… pnem… 71 261
## 5 2022-11-09 2021-11-01 2021-11-30 By Mo… Unit… Health… 75-8… pnem… 139 329
## 6 2022-11-09 2022-04-01 2022-04-30 By Mo… Unit… Total … All … fluD… 216 406
## 7 2022-11-09 2020-11-01 2020-11-30 By Mo… Unit… Total … 30-3… pneu… 227 413
## 8 2022-11-09 2021-08-01 2021-08-31 By Mo… Unit… Other All … pneu… 627 812
## 9 2022-11-09 2020-08-01 2020-08-31 By Mo… Unit… Other 0-17… tota… 116 297
## 10 2022-11-09 2020-09-01 2020-09-30 By Mo… Unit… Decede… 50-6… pnem… 190 370
## # … with 2,053 more rows, 2 more variables: delta <dbl>, pct <dbl>, and
## # abbreviated variable name ¹deathPlace
## # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # ℹ Use `colnames()` to see all variable names
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # ℹ Use `colnames()` to see all variable names
## # A tibble: 0 × 4
## # … with 4 variables: abb <chr>, cumValue <dbl>, tot_deaths <dbl>,
## # pctdiff <dbl>
## # ℹ Use `colnames()` to see all variable names
## # A tibble: 1 × 3
## cumValue tot_deaths pctdiff
## <dbl> <dbl> <dbl>
## 1 0 0 0
saveToRDS(allCause_221112, ovrWriteError=FALSE)
# STEP 3: Facets for excess all-cause deaths
excessDeathFacets(lstCDC=cdcList_20221112, lstAll=allCause_221112, dateThru="2022-10-15", plotYLim=c(-200, 1200))
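Each STEP 1 run reports a `cvDeathThru` date that lines up with `weekThru` (week 28 with 2022-07-16, week 31 with 2022-08-06). The sketch below shows one way such a date can be derived from the MMWR week definition. `epiWeekEnding()` is a hypothetical helper for illustration; whether `readRunCDCAllCause()` computes the date this way or looks it up from the data is not shown in this document.

```r
# Saturday that ends a given MMWR (epidemiological) week.
# MMWR weeks run Sunday to Saturday, and week 1 is the first week with at
# least four days in the calendar year, so it ends on the first Saturday
# on or after 4 January.
epiWeekEnding <- function(year, week) {
  jan4 <- as.Date(sprintf("%d-01-04", as.integer(year)))
  dow <- as.integer(format(jan4, "%w"))   # 0 = Sunday, ..., 6 = Saturday
  firstSaturday <- jan4 + (6 - dow)
  firstSaturday + 7 * (as.integer(week) - 1)
}

# Weeks used as weekThru in the runs in this document
print(epiWeekEnding(2022, c(28, 31, 36, 41)))

# First week of the file (2015 week 1)
print(epiWeekEnding(2015, 1))
```

The 2022 results agree with the `cvDeathThru` values printed in the logs, and the 2015 result agrees with the first `weekEnding` in the processed data.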
Restatement relative to the archived 2022 week 17 data (labelled deaths_220425) is also assessed:
# Plot for all ages
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221112",
                   lstLabels=c("deaths_220425", "deaths_221112"),
                   minDate="2021-12-01"
                   )
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Plot for under 45
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221112",
                   lstLabels=c("deaths_220425", "deaths_221112"),
                   lstFilter=list("age"=c("Under 25 years", "25-44 years")),
                   minDate="2021-12-01"
                   )
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "Under 25 years" "25-44 years"
##
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Yearly and quarterly plots
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221112",
                   lstLabels=c("deaths_220425", "deaths_221112"),
                   createStatePlot=FALSE,
                   restateYears=2022,
                   restateQuarterYears=2021:2022
                   )
##
## 17 states will be included: AL, AZ, CA, CO, FL, GA, IL, MI, NC, NY, OH, PA, SC, TN, TX, VA, WA
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MI" "NC" "NY" "OH" "PA" "SC" "TN" "TX"
## [16] "VA" "WA"
## # A tibble: 110,324 × 8
## state weekEnding age deaths_220425 deaths_…¹ delta neg base
## <chr> <date> <fct> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 AL 2015-01-10 Under 25 years 25 25 0 FALSE 25
## 2 AL 2015-01-10 25-44 years 67 67 0 FALSE 67
## 3 AL 2015-01-10 45-64 years 253 253 0 FALSE 253
## 4 AL 2015-01-10 65-74 years 202 202 0 FALSE 202
## 5 AL 2015-01-10 75-84 years 272 272 0 FALSE 272
## 6 AL 2015-01-10 85 years and older 320 320 0 FALSE 320
## 7 AL 2015-01-17 Under 25 years 28 28 0 FALSE 28
## 8 AL 2015-01-17 25-44 years 49 49 0 FALSE 49
## 9 AL 2015-01-17 45-64 years 256 256 0 FALSE 256
## 10 AL 2015-01-17 65-74 years 222 222 0 FALSE 222
## # … with 110,314 more rows, and abbreviated variable name ¹deaths_221112
## # ℹ Use `print(n = ...)` to see more rows
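The under-45 restatement plot restricts the data through `lstFilter`, a named list that maps a column name to the values to keep. A minimal sketch of applying such a filter to a data frame is given below. `applyListFilter()` is a hypothetical helper and the data are invented; the actual handling inside `plotCDCRestatement()` may differ.

```r
# Toy data (illustrative values only)
df <- data.frame(state = c("AL", "AL", "AL", "AZ"),
                 age = c("Under 25 years", "25-44 years", "45-64 years", "25-44 years"),
                 deaths = c(25, 67, 253, 40))

# Hypothetical helper: for each named element of the list, keep only rows
# whose value in that column is among the allowed values
applyListFilter <- function(df, lstFilter = list()) {
  for (vrbl in names(lstFilter)) {
    df <- df[df[[vrbl]] %in% lstFilter[[vrbl]], , drop = FALSE]
  }
  df
}

under45 <- applyListFilter(df, list("age" = c("Under 25 years", "25-44 years")))
print(under45)
```

An empty list leaves the data unchanged, so the same call path can serve both the all-ages and the filtered plots.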
The process is updated with the data downloaded on 2022-12-14:
# STEP 1: Latest CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20221214.csv"
cdcList_20221214 <- readRunCDCAllCause(loc=cdcLoc,
                                       weekThru=36,
                                       lst=readFromRDS("cdc_daily_221202"),
                                       stateNoCheck=c(),
                                       pdfCluster=TRUE,
                                       pdfAge=TRUE
                                       )
##
## Parameter cvDeathThru has been set as: 2022-09-10
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## [1] state weekEnding year week age Suppress deaths
## <0 rows> (or 0-length row.names)
##
##
## Problems by state:
## # A tibble: 0 × 5
## # … with 5 variables: noCheck <lgl>, state <chr>, problem <lgl>, n <int>,
## # deaths <dbl>
## Warning in max(.): no non-missing arguments to max; returning -Inf
##
##
## There are 0 rows with errors; maximum for any given state is -Inf errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 111,727
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala…
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10…
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",…
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,…
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,…
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8…
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015…
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted …
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,…
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 × 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 13148 0 457189
## 2 25-44 years 16894 0 1178266
## 3 45-64 years 20427 0 4449603
## 4 65-74 years 20419 0 4509284
## 5 75-84 years 20425 0 5519191
## 6 85 years and older 20414 0 6939094
##
##
## Checking variable combination: period year Type
## # A tibble: 8 × 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432820
## 7 2021 2021 Predicted (weighted) 14703 0 3450588
## 8 2022 2022 Predicted (weighted) 10157 0 2269060
##
##
## Checking variable combination: period Suppress
## # A tibble: 4 × 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432820
## 3 2021 <NA> 14703 0 3450588
## 4 2022 <NA> 10157 0 2269060
##
##
## Checking variable combination: period Note
## # A tibble: 11 × 5
## period Note n n_dea…¹ deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 1.39e7
## 2 2020 Data in recent weeks are incomplete. Only 60%… 279 0 8.69e4
## 3 2020 <NA> 14555 0 3.35e6
## 4 2021 Data in recent weeks are incomplete. Only 60%… 14415 0 3.38e6
## 5 2021 Data in recent weeks are incomplete. Only 60%… 277 0 7.12e4
## 6 2021 Data in recent weeks are incomplete. Only 60%… 11 0 3.15e3
## 7 2022 Data in recent weeks are incomplete. Only 60%… 8964 0 1.91e6
## 8 2022 Data in recent weeks are incomplete. Only 60%… 406 0 9.86e4
## 9 2022 Data in recent weeks are incomplete. Only 60%… 21 0 6.51e2
## 10 2022 Data in recent weeks are incomplete. Only 60%… 760 0 2.54e5
## 11 2022 Data in recent weeks are incomplete. Only 60%… 6 0 1.96e3
## # … with abbreviated variable name ¹n_deaths_na
##
## *** File has been checked for uniqueness by: cluster year week
##
## Plots will be run after excluding stateNoCheck states
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w36.pdf
##
## Returning plot outputs to the main log file
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation ideoms with `aes()`
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w36.pdf
##
## Returning plot outputs to the main log file
saveToRDS(cdcList_20221214, ovrWriteError=FALSE)
# STEP 2: Latest death by location-cause data
allCause_221214 <- analyzeAllCause(loc="COvID_deaths_age_place_20221214.csv",
                                   cdcDailyList=readFromRDS("cdc_daily_221202"),
                                   compareThruDate="2022-11-30"
                                   )
## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.
##
## States without abbreviations
## # A tibble: 2 × 10
## # Groups: State [2]
## State abb Year Month covid…¹ total…² pneum…³ pneum…⁴ fluDe…⁵ pnemo…⁶
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 New York Ci… <NA> 0 0 36775 201627 24983 13504 1106 48509
## 2 Puerto Rico <NA> 0 0 5476 97629 13669 3827 222 15509
## # … with abbreviated variable names ¹covidDeaths, ²totalDeaths, ³pneumoDeaths,
## # ⁴pneumoCovidDeaths, ⁵fluDeaths, ⁶pnemoFluCovidDeaths
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 2,178 × 12
## asofDate startDate endDate Group State death…¹ Age name dfSub dfTot
## <date> <date> <date> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 2022-12-14 2020-10-01 2020-10-31 By Mo… Unit… Total … 30-3… pnem… 205 401
## 2 2022-12-14 2021-10-01 2021-10-31 By Mo… Unit… Decede… 40-4… pnem… 150 346
## 3 2022-12-14 2020-02-01 2020-02-29 By Mo… Unit… Total … 30-3… pnem… 71 261
## 4 2022-12-14 2021-11-01 2021-11-30 By Mo… Unit… Health… 75-8… pnem… 139 329
## 5 2022-12-14 2022-04-01 2022-04-30 By Mo… Unit… Total … All … fluD… 217 407
## 6 2022-12-14 2020-11-01 2020-11-30 By Mo… Unit… Total … 30-3… pneu… 227 413
## 7 2022-12-14 2021-08-01 2021-08-31 By Mo… Unit… Other All … pneu… 627 812
## 8 2022-12-14 2022-10-01 2022-10-31 By Mo… Unit… Health… 40-4… pneu… 131 313
## 9 2022-12-14 2020-08-01 2020-08-31 By Mo… Unit… Other 0-17… tota… 116 297
## 10 2022-12-14 2020-09-01 2020-09-30 By Mo… Unit… Decede… 50-6… pnem… 190 370
## # … with 2,168 more rows, 2 more variables: delta <dbl>, pct <dbl>, and
## # abbreviated variable name ¹deathPlace
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # A tibble: 51 × 4
## abb cumValue tot_deaths pctdiff
## <chr> <dbl> <dbl> <dbl>
## 1 NY 39339 73751 0.304
## 2 DC 2113 1407 0.201
## 3 ND 2951 2232 0.139
## 4 GA 34564 41070 0.0860
## 5 NC 31844 27375 0.0755
## 6 OH 46774 40466 0.0723
## 7 OK 17337 15049 0.0706
## 8 MI 34834 40085 0.0701
## 9 NE 5338 4663 0.0675
## 10 MA 19784 22488 0.0640
## # … with 41 more rows
## # A tibble: 1 × 3
## cumValue tot_deaths pctdiff
## <dbl> <dbl> <dbl>
## 1 1040783 1071245 2.01
saveToRDS(allCause_221214, ovrWriteError=FALSE)
# STEP 3: Facets for excess all-cause deaths
excessDeathFacets(lstCDC=cdcList_20221214, lstAll=allCause_221214, dateThru="2022-11-30", plotYLim=c(-200, 1200))
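The state-level comparison printed by STEP 2 lists `cumValue`, `tot_deaths`, and `pctdiff`. The printed state values are consistent with a symmetric difference, |cumValue - tot_deaths| / (cumValue + tot_deaths); this formula is inferred from the numbers and not confirmed against the sourced function. The sketch below reproduces the first four rows. Applied to the one-row totals, the same formula gives about 0.014 rather than the printed 2.01, which suggests the summary row may be adding up the state-level values; that would need to be checked in `analyzeAllCause()`.

```r
# First four rows of the printed comparison table
chk <- data.frame(abb = c("NY", "DC", "ND", "GA"),
                  cumValue = c(39339, 2113, 2951, 34564),
                  tot_deaths = c(73751, 1407, 2232, 41070))

# Assumed formula: absolute difference divided by the sum of the two values
chk$pctdiff <- abs(chk$cumValue - chk$tot_deaths) / (chk$cumValue + chk$tot_deaths)
print(signif(chk$pctdiff, 3))

# Same formula applied to the printed national totals
overall <- abs(1040783 - 1071245) / (1040783 + 1071245)
print(round(overall, 4))
```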
Restatement relative to the archived 2022 week 17 data (labelled deaths_220425) is also assessed:
# Plot for all ages
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221214",
                   lstLabels=c("deaths_220425", "deaths_221214"),
                   minDate="2021-12-01"
                   )
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Plot for under 45
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221214",
                   lstLabels=c("deaths_220425", "deaths_221214"),
                   lstFilter=list("age"=c("Under 25 years", "25-44 years")),
                   minDate="2021-12-01"
                   )
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "Under 25 years" "25-44 years"
##
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Yearly and quarterly plots
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20221214",
                   lstLabels=c("deaths_220425", "deaths_221214"),
                   createStatePlot=FALSE,
                   restateYears=2022,
                   restateQuarterYears=2021:2022
                   )
##
## 17 states will be included: AL, AZ, CA, CO, FL, GA, IL, MI, NC, NY, OH, PA, SC, TN, TX, VA, WA
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MI" "NC" "NY" "OH" "PA" "SC" "TN" "TX"
## [16] "VA" "WA"
## # A tibble: 111,732 × 8
## state weekEnding age deaths_220425 deaths_…¹ delta neg base
## <chr> <date> <fct> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 AL 2015-01-10 Under 25 years 25 25 0 FALSE 25
## 2 AL 2015-01-10 25-44 years 67 67 0 FALSE 67
## 3 AL 2015-01-10 45-64 years 253 253 0 FALSE 253
## 4 AL 2015-01-10 65-74 years 202 202 0 FALSE 202
## 5 AL 2015-01-10 75-84 years 272 272 0 FALSE 272
## 6 AL 2015-01-10 85 years and older 320 320 0 FALSE 320
## 7 AL 2015-01-17 Under 25 years 28 28 0 FALSE 28
## 8 AL 2015-01-17 25-44 years 49 49 0 FALSE 49
## 9 AL 2015-01-17 45-64 years 256 256 0 FALSE 256
## 10 AL 2015-01-17 65-74 years 222 222 0 FALSE 222
## # … with 111,722 more rows, and abbreviated variable name ¹deaths_221214
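The yearly and quarterly restatement plots roll the weekly comparison up to coarser periods. A minimal sketch of a quarterly roll-up is shown below. The data are invented, the column names are placeholders, and assigning each week to the calendar quarter of its week-ending date is an assumption; the sourced function may use epidemiological years and weeks instead.

```r
# Toy weekly comparison of two vintages (values are made up)
cmp <- data.frame(
  weekEnding = as.Date(c("2021-11-06", "2021-12-25", "2022-01-08",
                         "2022-02-12", "2022-04-09")),
  deaths_old = c(100, 120, 130, 110, 60),
  deaths_new = c(100, 118, 135, 121, 90))

# Label each week by the calendar year and quarter of its week-ending date
cmp$yq <- paste0(format(cmp$weekEnding, "%Y"), "-", quarters(cmp$weekEnding))

# Sum both vintages by quarter, then compute the percent restatement
byQtr <- aggregate(cbind(deaths_old, deaths_new) ~ yq, data = cmp, FUN = sum)
byQtr$pctChg <- byQtr$deaths_new / byQtr$deaths_old - 1
print(byQtr)
```

Recent quarters would be expected to show the largest upward restatement, since reporting lag affects the most recent weeks most.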
The process is updated with the data downloaded on 2023-01-13:
# STEP 1: Latest CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20230113.csv"
cdcList_20230113 <- readRunCDCAllCause(loc=cdcLoc,
                                       weekThru=41,
                                       lst=readFromRDS("cdc_daily_230102"),
                                       stateNoCheck=c(),
                                       pdfCluster=TRUE,
                                       pdfAge=TRUE
                                       )
##
## Parameter cvDeathThru has been set as: 2022-10-15
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## [1] state weekEnding year week age Suppress deaths
## <0 rows> (or 0-length row.names)
##
##
## Problems by state:
## # A tibble: 0 × 5
## # … with 5 variables: noCheck <lgl>, state <chr>, problem <lgl>, n <int>,
## # deaths <dbl>
## Warning in max(.): no non-missing arguments to max; returning -Inf
##
##
## There are 0 rows with errors; maximum for any given state is -Inf errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 113,131
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala…
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10…
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",…
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,…
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,…
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8…
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015…
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted …
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,…
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 × 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 13308 0 462974
## 2 25-44 years 17118 0 1195123
## 3 45-64 years 20682 0 4503237
## 4 65-74 years 20674 0 4569187
## 5 75-84 years 20680 0 5593572
## 6 85 years and older 20669 0 7023354
##
##
## Checking variable combination: period year Type
## # A tibble: 8 × 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432820
## 7 2021 2021 Predicted (weighted) 14701 0 3450597
## 8 2022 2022 Predicted (weighted) 11563 0 2563871
##
##
## Checking variable combination: period Suppress
## # A tibble: 4 × 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432820
## 3 2021 <NA> 14701 0 3450597
## 4 2022 <NA> 11563 0 2563871
##
##
## Checking variable combination: period Note
## # A tibble: 7 × 5
## period Note n n_dea…¹ deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 1.39e7
## 2 2020 <NA> 14834 0 3.43e6
## 3 2021 Data in recent weeks are incomplete. Only 60% … 288 0 7.44e4
## 4 2021 <NA> 14413 0 3.38e6
## 5 2022 Data in recent weeks are incomplete. Only 60% … 10569 0 2.21e6
## 6 2022 Data in recent weeks are incomplete. Only 60% … 13 0 4 e2
## 7 2022 Data in recent weeks are incomplete. Only 60% … 981 0 3.50e5
## # … with abbreviated variable name ¹n_deaths_na
##
## *** File has been checked for uniqueness by: cluster year week
##
## Plots will be run after excluding stateNoCheck states
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w41.pdf
##
## Returning plot outputs to the main log file
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation ideoms with `aes()`
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w41.pdf
##
## Returning plot outputs to the main log file
saveToRDS(cdcList_20230113, ovrWriteError=FALSE)
# STEP 2: Latest death by location-cause data
allCause_230113 <- analyzeAllCause(loc="COvID_deaths_age_place_20230113.csv",
                                   cdcDailyList=readFromRDS("cdc_daily_230102"),
                                   compareThruDate="2022-12-31"
                                   )
## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.
##
## States without abbreviations
## # A tibble: 2 × 10
## # Groups: State [2]
## State abb Year Month covid…¹ total…² pneum…³ pneum…⁴ fluDe…⁵ pnemo…⁶
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 New York Ci… <NA> 0 0 37194 206805 25485 13616 1185 49390
## 2 Puerto Rico <NA> 0 0 5600 100123 14067 3904 241 15970
## # … with abbreviated variable names ¹covidDeaths, ²totalDeaths, ³pneumoDeaths,
## # ⁴pneumoCovidDeaths, ⁵fluDeaths, ⁶pnemoFluCovidDeaths
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 2,266 × 12
## asofDate startDate endDate Group State death…¹ Age name dfSub dfTot
## <date> <date> <date> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 2023-01-11 2020-10-01 2020-10-31 By Mo… Unit… Total … 30-3… pnem… 205 401
## 2 2023-01-11 2021-10-01 2021-10-31 By Mo… Unit… Decede… 40-4… pnem… 150 346
## 3 2023-01-11 2020-02-01 2020-02-29 By Mo… Unit… Total … 30-3… pnem… 71 261
## 4 2023-01-11 2021-11-01 2021-11-30 By Mo… Unit… Health… 75-8… pnem… 139 329
## 5 2023-01-11 2022-04-01 2022-04-30 By Mo… Unit… Total … All … fluD… 217 407
## 6 2023-01-11 2020-01-01 2023-01-07 By To… Unit… Health… 30-3… fluD… 167 355
## 7 2023-01-11 2020-11-01 2020-11-30 By Mo… Unit… Total … 30-3… pneu… 227 413
## 8 2023-01-11 2021-08-01 2021-08-31 By Mo… Unit… Other All … pneu… 627 812
## 9 2023-01-11 2020-08-01 2020-08-31 By Mo… Unit… Other 0-17… tota… 116 297
## 10 2023-01-11 2020-09-01 2020-09-30 By Mo… Unit… Decede… 50-6… pnem… 190 370
## # … with 2,256 more rows, 2 more variables: delta <dbl>, pct <dbl>, and
## # abbreviated variable name ¹deathPlace
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # A tibble: 51 × 4
## abb cumValue tot_deaths pctdiff
## <chr> <dbl> <dbl> <dbl>
## 1 AL 20575 NA NA
## 2 AK 1389 NA NA
## 3 AZ 29195 NA NA
## 4 AR 12097 NA NA
## 5 CA 103568 NA NA
## 6 CO 14616 NA NA
## 7 CT 11958 NA NA
## 8 DE 3206 NA NA
## 9 DC 2131 NA NA
## 10 FL 77425 NA NA
## # … with 41 more rows
## # A tibble: 1 × 3
## cumValue tot_deaths pctdiff
## <dbl> <dbl> <dbl>
## 1 1052743 NA NA
## Warning: Removed 51 rows containing missing values (`geom_point()`).
saveToRDS(allCause_230113, ovrWriteError=FALSE)
# STEP 3: Facets for excess all-cause deaths
excessDeathFacets(lstCDC=cdcList_20230113, lstAll=allCause_230113, dateThru="2022-12-31", plotYLim=c(-200, 1200))
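The suppression check in step 1 reports the number of flagged rows and the maximum for any single state. When no rows are flagged, `max()` receives an empty vector, warns, and returns `-Inf`, which is what the log above shows. A minimal sketch of a per-state tally that guards the empty case follows. The helper name `summarizeSuppression()` and the toy data are hypothetical and are not taken from the sourced function files; only the column names `state`, `deaths`, and `Suppress` mirror the log.

```r
library(dplyr)

# Hypothetical helper (not from the sourced files): count rows per state that
# have NA deaths or a non-missing suppression note
summarizeSuppression <- function(df) {
    flagged <- df %>%
        filter(is.na(deaths) | !is.na(Suppress)) %>%
        count(state, name="n")
    # Guard the empty case so max() is never called on a zero-length vector
    maxByState <- if (nrow(flagged) == 0) 0L else max(flagged$n)
    list(nRows=sum(flagged$n), maxByState=maxByState, byState=flagged)
}

# Toy data for illustration only (not CDC values)
toy <- tibble::tibble(state=c("AK", "LA", "LA", "TX"),
                      deaths=c(NA, NA, NA, 100),
                      Suppress=c("Suppressed", "Suppressed", "Suppressed", NA)
                      )

resFlag <- summarizeSuppression(toy)
resNone <- summarizeSuppression(filter(toy, state == "TX"))
cat("There are", resFlag$nRows, "rows with errors; maximum for any given state is",
    resFlag$maxByState, "errors\n")
```

Returning 0 rather than `-Inf` for the empty case is a design choice of this sketch; it keeps the printed message readable when nothing is flagged.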
Restatement of previously reported deaths is also assessed by comparing the archived 2022w17 list against the latest list:
# Plot for all ages
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20230113",
                   lstLabels=c("deaths_220425", "deaths_230113"),
                   minDate="2021-12-01"
                   )
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Plot for under 45
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20230113",
                   lstLabels=c("deaths_220425", "deaths_230113"),
                   lstFilter=list("age"=c("Under 25 years", "25-44 years")),
                   minDate="2021-12-01"
                   )
##
## 20 states will be included: AL, AZ, CA, CO, FL, GA, IL, MD, MI, NC, NY, OH, OK, PA, SC, TN, TX, VA, WA, WI
## $age
## [1] "Under 25 years" "25-44 years"
##
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MD" "MI" "NC" "NY" "OH" "OK" "PA" "SC"
## [16] "TN" "TX" "VA" "WA" "WI"
# Yearly and quarterly plots
plotCDCRestatement(lstOld="cdcList_arch_2022w17",
                   lstNew="cdcList_20230113",
                   lstLabels=c("deaths_220425", "deaths_230113"),
                   createStatePlot=FALSE,
                   restateYears=2022,
                   restateQuarterYears=2021:2022
                   )
##
## 17 states will be included: AL, AZ, CA, CO, FL, GA, IL, MI, NC, NY, OH, PA, SC, TN, TX, VA, WA
## $age
## [1] "AL" "AZ" "CA" "CO" "FL" "GA" "IL" "MI" "NC" "NY" "OH" "PA" "SC" "TN" "TX"
## [16] "VA" "WA"
## # A tibble: 113,137 × 8
## state weekEnding age deaths_220425 deaths_…¹ delta neg base
## <chr> <date> <fct> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 AL 2015-01-10 Under 25 years 25 25 0 FALSE 25
## 2 AL 2015-01-10 25-44 years 67 67 0 FALSE 67
## 3 AL 2015-01-10 45-64 years 253 253 0 FALSE 253
## 4 AL 2015-01-10 65-74 years 202 202 0 FALSE 202
## 5 AL 2015-01-10 75-84 years 272 272 0 FALSE 272
## 6 AL 2015-01-10 85 years and older 320 320 0 FALSE 320
## 7 AL 2015-01-17 Under 25 years 28 28 0 FALSE 28
## 8 AL 2015-01-17 25-44 years 49 49 0 FALSE 49
## 9 AL 2015-01-17 45-64 years 256 256 0 FALSE 256
## 10 AL 2015-01-17 65-74 years 222 222 0 FALSE 222
## # … with 113,127 more rows, and abbreviated variable name ¹deaths_230113
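The restatement table above pairs each state, week, and age group with the deaths reported in the two vintages. A minimal sketch of how such a comparison could be built is shown here. It assumes that `delta` is the newer count minus the older count and that `neg` flags negative deltas, which is consistent with the rows displayed but is not confirmed by the source. The `base` column is not reproduced, and the toy values and the names `dfOld`, `dfNew`, and `dfRestate` are hypothetical.

```r
library(dplyr)

# Toy stand-ins for the archived and latest data (values are illustrative only)
dfOld <- tibble::tibble(state="AL",
                        weekEnding=as.Date(c("2022-03-05", "2022-03-12")),
                        age="25-44 years",
                        deaths=c(60, 55)
                        )
dfNew <- tibble::tibble(state="AL",
                        weekEnding=as.Date(c("2022-03-05", "2022-03-12")),
                        age="25-44 years",
                        deaths=c(62, 54)
                        )

# Join the two vintages on their shared keys, then compute the change
# ASSUMPTION: delta is new minus old; neg marks downward restatements
dfRestate <- full_join(rename(dfOld, deaths_old=deaths),
                       rename(dfNew, deaths_new=deaths),
                       by=c("state", "weekEnding", "age")
                       ) %>%
    mutate(delta=deaths_new - deaths_old, neg=delta < 0)
dfRestate
```

A full join keeps keys that appear in only one vintage, so weeks added since the archive show up with `NA` in the older column instead of being dropped silently.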
The process is then repeated with the data downloaded on 2023-02-13:
# STEP 1: Latest CDC all-cause deaths data
cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20230213.csv"
cdcList_20230213 <- readRunCDCAllCause(loc=cdcLoc,
                                       weekThru=46,
                                       lst=readFromRDS("cdc_daily_230202"),
                                       stateNoCheck=c(),
                                       pdfCluster=TRUE,
                                       pdfAge=TRUE
                                       )
##
## Parameter cvDeathThru has been set as: 2022-11-19
##
##
## *** Data suppression checks ***
##
## Rows in states to be checked that have NA deaths or a note for suppression:
## state weekEnding year week age
## 1 AK 2023-01-28 <NA> 4 65-74 years
## 2 LA 2023-01-28 <NA> 4 45-64 years
## 3 LA 2023-01-28 <NA> 4 65-74 years
## 4 LA 2023-01-28 <NA> 4 75-84 years
## 5 LA 2023-01-28 <NA> 4 85 years and older
## Suppress deaths
## 1 Suppressed (counts highly incomplete, <50% of expected) NA
## 2 Suppressed (counts highly incomplete, <50% of expected) NA
## 3 Suppressed (counts highly incomplete, <50% of expected) NA
## 4 Suppressed (counts highly incomplete, <50% of expected) NA
## 5 Suppressed (counts highly incomplete, <50% of expected) NA
##
##
## Problems by state:
## # A tibble: 2 × 5
## noCheck state problem n deaths
## <lgl> <chr> <lgl> <int> <dbl>
## 1 FALSE AK TRUE 1 NA
## 2 FALSE LA TRUE 4 NA
##
##
## There are 5 rows with errors; maximum for any given state is 4 errors
##
##
## Data suppression checks passed
##
##
## *** File has been checked for uniqueness by: state year week age
##
## Rows: 115,583
## Columns: 12
## $ fullState <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala…
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10…
## $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",…
## $ year <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,…
## $ week <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,…
## $ age <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8…
## $ period <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015…
## $ Type <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted …
## $ Suppress <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ deaths <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,…
## $ Note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
##
## Check Control Levels and Record Counts for Processed Data:
##
##
## Checking variable combination: age
## # A tibble: 6 × 4
## age n n_deaths_na deaths
## <fct> <dbl> <dbl> <dbl>
## 1 Under 25 years 13556 0 470892
## 2 25-44 years 17502 0 1220945
## 3 45-64 years 21135 0 4595485
## 4 65-74 years 21129 0 4677835
## 5 75-84 years 21136 0 5734310
## 6 85 years and older 21125 0 7185838
##
##
## Checking variable combination: period year Type
## # A tibble: 9 × 6
## period year Type n n_deaths_na deaths
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 2015 Predicted (weighted) 14367 0 2698242
## 2 2015-2019 2016 Predicted (weighted) 14445 0 2725557
## 3 2015-2019 2017 Predicted (weighted) 14408 0 2802070
## 4 2015-2019 2018 Predicted (weighted) 14400 0 2830373
## 5 2015-2019 2019 Predicted (weighted) 14413 0 2843917
## 6 2020 2020 Predicted (weighted) 14834 0 3432820
## 7 2021 2021 Predicted (weighted) 14704 0 3450608
## 8 2022 2022 Predicted (weighted) 12978 0 2867938
## 9 <NA> <NA> Predicted (weighted) 1034 0 233780
##
##
## Checking variable combination: period Suppress
## # A tibble: 5 × 5
## period Suppress n n_deaths_na deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 13900159
## 2 2020 <NA> 14834 0 3432820
## 3 2021 <NA> 14704 0 3450608
## 4 2022 <NA> 12978 0 2867938
## 5 <NA> <NA> 1034 0 233780
##
##
## Checking variable combination: period Note
## # A tibble: 9 × 5
## period Note n n_dea…¹ deaths
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 2015-2019 <NA> 72033 0 1.39e7
## 2 2020 <NA> 14834 0 3.43e6
## 3 2021 Data in recent weeks are incomplete. Only 60% … 288 0 7.44e4
## 4 2021 <NA> 14416 0 3.38e6
## 5 2022 Data in recent weeks are incomplete. Only 60% … 11669 0 2.41e6
## 6 2022 Data in recent weeks are incomplete. Only 60% … 12 0 4.15e2
## 7 2022 Data in recent weeks are incomplete. Only 60% … 1297 0 4.54e5
## 8 <NA> Data in recent weeks are incomplete. Only 60% … 990 0 2.24e5
## 9 <NA> Data in recent weeks are incomplete. Only 60% … 44 0 9.89e3
## # … with abbreviated variable name ¹n_deaths_na
##
## *** File has been checked for uniqueness by: cluster year week
## Warning in max(.): no non-missing arguments to max; returning -Inf
##
## Plots will be run after excluding stateNoCheck states
## Warning: Removed 4 rows containing missing values (`geom_line()`).
## Warning: Removed 4 rows containing missing values (`geom_line()`).
## Removed 4 rows containing missing values (`geom_line()`).
##
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2022w46.pdf
## Warning: Removed 4 rows containing missing values (`geom_line()`).
##
## Returning plot outputs to the main log file
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation ideoms with `aes()`
## Joining, by = "state"
##
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2022w46.pdf
## Warning: Removed 4 rows containing missing values (`geom_line()`).
##
## Returning plot outputs to the main log file
saveToRDS(cdcList_20230213, ovrWriteError=FALSE)
# STEP 2: Latest death by location-cause data
allCause_230213 <- analyzeAllCause(loc="COvID_deaths_age_place_20230213.csv",
                                   cdcDailyList=readFromRDS("cdc_daily_230202"),
                                   compareThruDate="2023-01-31"
                                   )
## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.
##
## States without abbreviations
## # A tibble: 2 × 10
## # Groups: State [2]
## State abb Year Month covid…¹ total…² pneum…³ pneum…⁴ fluDe…⁵ pnemo…⁶
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 New York Ci… <NA> 0 0 37542 211447 25925 13736 1216 50086
## 2 Puerto Rico <NA> 0 0 5749 103190 14533 4006 254 16495
## # … with abbreviated variable names ¹covidDeaths, ²totalDeaths, ³pneumoDeaths,
## # ⁴pneumoCovidDeaths, ⁵fluDeaths, ⁶pnemoFluCovidDeaths
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 2,346 × 12
## asofDate startDate endDate Group State death…¹ Age name dfSub dfTot
## <date> <date> <date> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 2023-02-08 2020-10-01 2020-10-31 By Mo… Unit… Total … 30-3… pnem… 205 401
## 2 2023-02-08 2021-10-01 2021-10-31 By Mo… Unit… Decede… 40-4… pnem… 150 346
## 3 2023-02-08 2020-02-01 2020-02-29 By Mo… Unit… Total … 30-3… pnem… 71 261
## 4 2023-02-08 2021-11-01 2021-11-30 By Mo… Unit… Health… 75-8… pnem… 139 329
## 5 2023-02-08 2022-04-01 2022-04-30 By Mo… Unit… Total … All … fluD… 217 407
## 6 2023-02-08 2020-11-01 2020-11-30 By Mo… Unit… Total … 30-3… pneu… 227 413
## 7 2023-02-08 2021-08-01 2021-08-31 By Mo… Unit… Other All … pneu… 627 812
## 8 2023-02-08 2020-08-01 2020-08-31 By Mo… Unit… Other 0-17… tota… 116 297
## 9 2023-02-08 2020-09-01 2020-09-30 By Mo… Unit… Decede… 50-6… pnem… 190 370
## 10 2023-02-08 2021-10-01 2021-10-31 By Mo… Unit… Decede… 65-7… pneu… 87 267
## # … with 2,336 more rows, 2 more variables: delta <dbl>, pct <dbl>, and
## # abbreviated variable name ¹deathPlace
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
##
## Sub-lists are identical by: asofDate, startDate, endDate, Group, State, deathPlace, Age
## # A tibble: 0 × 12
## # … with 12 variables: asofDate <date>, startDate <date>, endDate <date>,
## # Group <chr>, State <chr>, deathPlace <chr>, Age <chr>, name <chr>,
## # dfSub <dbl>, dfTot <dbl>, delta <dbl>, pct <dbl>
## # A tibble: 51 × 4
## abb cumValue tot_deaths pctdiff
## <chr> <dbl> <dbl> <dbl>
## 1 AL 20817 NA NA
## 2 AK 1390 NA NA
## 3 AZ 29486 NA NA
## 4 AR 12264 NA NA
## 5 CA 104986 NA NA
## 6 CO 14816 NA NA
## 7 CT 12158 NA NA
## 8 DE 3259 NA NA
## 9 DC 2159 NA NA
## 10 FL 78358 NA NA
## # … with 41 more rows
## # A tibble: 1 × 3
## cumValue tot_deaths pctdiff
## <dbl> <dbl> <dbl>
## 1 1067292 NA NA
## Warning: Removed 51 rows containing missing values (`geom_point()`).
saveToRDS(allCause_230213, ovrWriteError=FALSE)
# STEP 3: Facets for excess all-cause deaths
excessDeathFacets(lstCDC=cdcList_20230213, lstAll=allCause_230213, dateThru="2023-01-31", plotYLim=c(-200, 1200))
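Step 3 plots excess all-cause deaths, but the calculation inside `excessDeathFacets()` is not shown in this log. The sketch below illustrates one simple way to define excess deaths: observed deaths minus the mean for the same week across 2015-2019. That baseline is an assumption made for illustration, since the sourced function may use a different one, such as a trend-adjusted model. The data frame names and all values are hypothetical.

```r
library(dplyr)

# Toy weekly deaths for one state and age group (values are illustrative only)
toyDeaths <- tibble::tibble(year=rep(2015:2020, each=2),
                            week=rep(1:2, times=6),
                            deaths=c(100, 102, 98, 101, 103, 99,
                                     101, 100, 98, 98, 130, 125
                                     )
                            )

# ASSUMED baseline: mean deaths for the same week across 2015-2019
baseline <- toyDeaths %>%
    filter(year %in% 2015:2019) %>%
    group_by(week) %>%
    summarize(expected=mean(deaths), .groups="drop")

# Excess is observed minus expected for the comparison year
excess <- toyDeaths %>%
    filter(year == 2020) %>%
    inner_join(baseline, by="week") %>%
    mutate(excess=deaths - expected)
excess
```

A flat average ignores population growth and aging, so it tends to overstate excess deaths relative to a baseline that accounts for trend.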